We consider the problem of estimating the distribution function,
the density and the hazard rate of the (unobservable) event time in
the current status model. A well studied and natural nonparametric
estimator for the distribution function in this model is the
nonparametric maximum likelihood estimator (MLE). We study two
alternative methods for the estimation of the distribution function,
assuming some smoothness of the event time distribution. The first
estimator is based on a maximum smoothed likelihood approach. The
second method is based on smoothing the (discrete) MLE of the
distribution function. These estimators can be used to estimate the
density and hazard rate of the event time distribution based on the
plug-in principle.

1. Introduction. In survival analysis, one is interested in the distribution
of the time it takes before a certain event (failure, onset of a disease) takes
place. Depending on exactly what information is obtained on the time $X$
and the precise assumptions imposed on its distribution function $F_0$, many
estimators for $F_0$ have been defined and studied in the literature.

When a sample of $X_i$'s is directly and completely observed, one can
estimate $F_0$ under various assumptions. In the parametric approach, one
assumes $F_0$ to belong to a parametric class of distributions, e.g., the
exponential or Weibull distributions. Then estimating $F_0$ boils down to
estimating a finite-dimensional parameter and a variety of classical point
estimation procedures can be used to do this. If one wishes to estimate $F_0$
fully nonparametrically, so without assuming any properties of $F_0$ other
than the basic properties of distribution functions, the empirical
distribution function $\mathbb{F}_n$ of $X_1,\ldots,X_n$ is a natural candidate
to use. If the distribution function is known to have a continuous derivative
$f_0$ w.r.t. Lebesgue measure, one can use kernel estimators [see, e.g.,
Silverman (1986)] or wavelet methods [see, e.g., Donoho and Johnstone (1995)]
for estimating $f_0$. Finally, in case $F_0$ is known to satisfy a certain shape
constraint such as concavity or convex-concavity on $[0,\infty)$, a
shape-constrained estimator for $F_0$ can be used. Problems of this type were
considered in, e.g., Bickel and Fan (1996), Groeneboom, Jongbloed and Wellner
(2002) and Dümbgen and Rufibach (2009).

However, in many cases the variable $X$ is not observed completely, due to
some sort of censoring. Parametric inference in such situations is often not
really different from that based on exactly observed $X_i$'s. The parametric
model for $X$ basically transforms to a parametric model for the observable
data and the usual methods for parametric point estimation can be used to
estimate $F_0$. For various types of censoring, nonparametric estimators
have also been proposed. In the context of right-censoring, the Kaplan-Meier
estimator [see Kaplan and Meier (1958)] is the (nonparametric) maximum
likelihood estimator of $F_0$. It maximizes the likelihood of the observed data
over all distribution functions, without any additional constraints. Density
estimators also exist in this setting; see, e.g., Marron and Padgett (1987).
Huang and Zhang (1994) consider the MLE for estimating $F_0$ and its density
in this setting under the assumption that $F_0$ is concave on $[0,\infty)$.

The type of censoring we focus on in this paper is interval censoring, case
I. The model for this type of observations is also known as the current status
model. In this model, a censoring variable $T$, independent of $X$, is observed
as well as a variable $\Delta = 1_{\{X \le T\}}$, indicating whether the
(unobservable) $X$ lies to the left or to the right of the observed $T$. For
this model, the (nonparametric) maximum likelihood estimator is studied in
Groeneboom and Wellner (1992). This estimator is discrete and is therefore not
suitable for estimating the density $f_0$, the hazard rate
$\lambda_0 = f_0/(1 - F_0)$ or the transmission potential studied in Keiding
(1991), which depends on the hazard rate $\lambda_0$. An estimator that can be
used to estimate these quantities is the maximum likelihood estimator studied
by Dümbgen, Freitag-Wolf and Jongbloed (2006) under the constraint that $F_0$
is concave or convex-concave.

In this paper, we study two likelihood based estimators for $F_0$ (and its
density $f_0$ and hazard rate $\lambda_0$) based on interval censored data from
$F_0$ under the assumption that $F_0$ is continuously differentiable. The first
estimator we study is a so-called maximum smoothed likelihood estimator (MSLE)
as studied by Eggermont and LaRiccia (2001) in the context of monotone and
unimodal density estimation. It is a general likelihood-based M-estimator
that will turn out to be smooth automatically. The second estimator we
consider, the smoothed maximum likelihood estimator (SMLE), is obtained by
convolving the (discrete) MLE of Groeneboom and Wellner (1992) with a
smoothing kernel. These different methods result in different but related
estimators. Analyzing the pointwise asymptotics shows that only the biases of
these estimators differ while the variances are equal. We cannot say that one
estimator is uniformly superior to the other. In a somewhat analogous way,
Mammen (1991) studies the differences between the efficiencies of smoothing
of isotonic estimates and isotonizing smooth estimates. This also does not
produce a clear "winner."

The outline of this paper is as follows. In Section 2, we introduce the
current status model and review some results needed in the sequel. The MSLE
$\hat F_n^{MS}$ for $F_0$ based on current status data is introduced and
characterized in Section 3. Moreover, asymptotic results are derived for
$\hat F_n^{MS}$ as well as its density $\hat f_n^{MS}$ and hazard rate
$\hat\lambda_n^{MS}$, showing that the rate of convergence of $\hat F_n^{MS}$
is faster than the rate of convergence of the MLE. In Section 4, the SMLE for
$F_0$, $f_0$ and $\lambda_0$ are introduced and their asymptotic properties
derived. The resulting asymptotic distributions are very similar to the
asymptotic distributions of the MSLE. In Section 5, we briefly address the
problem of bandwidth selection in practice. We also apply these methods to
a data set on hepatitis A from Keiding (1991). Technical proofs and lemmas
can be found in the Appendix.

2. The current status model. Consider an i.i.d. sequence $X_1, X_2, \ldots$ with
distribution $F_0$ on $[0,\infty)$ and, independent of this, an i.i.d. sequence
$T_1, T_2, \ldots$ from a distribution $G$ with Lebesgue density $g$ on
$[0,\infty)$. Based on these sequences, define
$Z_i = (T_i, 1_{\{X_i \le T_i\}}) =: (T_i, \Delta_i)$. Then $Z_1, Z_2, \ldots$
are i.i.d. and have density $f_Z$ with respect to the product of Lebesgue and
counting measure on $[0,\infty) \times \{0,1\}$:

$$
f_Z(t,\delta) = g(t)\{\delta F_0(t) + (1-\delta)(1 - F_0(t))\}
= \delta g_1(t) + (1-\delta) g_0(t). \tag{2.1}
$$

One usually says that the $X_i$'s take their values in the hidden space
$[0,\infty)$ and the $Z_i$ take their values in the observation space
$[0,\infty) \times \{0,1\}$.

Let $\mathbb{P}_n$ be the empirical distribution of $Z_1,\ldots,Z_n$. Writing
down the log likelihood as a function of $F$ and dividing by $n$, we get

$$
l(F) = \int \{\delta \log F(t) + (1-\delta)\log(1 - F(t))\}\,d\mathbb{P}_n(t,\delta). \tag{2.2}
$$

Here, we ignore a term in the log likelihood that does not depend on the
distribution function $F$.

In Groeneboom and Wellner (1992), it is shown that the (nonparametric)
maximum likelihood estimator (MLE) $\hat F_n$ is well defined as maximizer of
(2.2) over all distribution functions and that it can be characterized as the
left derivative of the greatest convex minorant of a cumulative sum diagram.
To be precise, the observed time points $T_i$ are ordered in increasing order,
yielding $T_{(1)} < T_{(2)} < \cdots < T_{(n)}$, and the $\Delta$ associated
with $T_{(i)}$ is denoted by $\Delta_{(i)}$. Then the cumulative sum diagram
consisting of the points

$$
P_0 = (0,0), \qquad
P_i = \Bigl(\frac{i}{n},\ \frac{1}{n}\sum_{j=1}^{i}\Delta_{(j)}\Bigr), \quad 1 \le i \le n,
$$

is constructed. Having determined the greatest convex minorant of this
diagram, $\hat F_n(T_{(i)})$ is given by the left derivative of this minorant,
evaluated at the point $P_i$. At other points it is defined by right
continuity. Denoting by $G_n$ the empirical distribution function of the
$T_i$'s and by $G_{n,1}$ the empirical subdistribution function of the $T_i$'s
with $\Delta_i = 1$, observe that for $0 \le i \le n$ (with the convention
$T_{(0)} = 0$), $P_i = (G_n(T_{(i)}), G_{n,1}(T_{(i)}))$. Also note that
$\hat F_n$ is a step function of which the set of jump points
$\{\tau_1,\ldots,\tau_m\}$ is a subset of the set $\{T_i : 1 \le i \le n\}$.
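The characterization above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it relies on the standard fact that the left derivatives of the greatest convex minorant of a cumulative sum diagram with equally spaced abscissae coincide with the isotonic (non-decreasing) least-squares fit to the ordered indicators, computed here by pool-adjacent-violators. The function names `pava`, `current_status_mle` and `evaluate_mle` are hypothetical helpers introduced for this sketch.

```python
import numpy as np

def pava(y):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators)."""
    means, counts = [], []
    for value in np.asarray(y, dtype=float):
        means.append(value)
        counts.append(1)
        # Pool the last two blocks while they violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            pooled = (counts[-2] * means[-2] + counts[-1] * means[-1]) / total
            means[-2:] = [pooled]
            counts[-2:] = [total]
    return np.repeat(means, counts)

def current_status_mle(t_obs, delta):
    """Return the ordered observation times and the MLE values at those times."""
    order = np.argsort(t_obs, kind="stable")
    t_sorted = np.asarray(t_obs, dtype=float)[order]
    fitted = pava(np.asarray(delta, dtype=float)[order])
    return t_sorted, fitted

def evaluate_mle(t_sorted, fitted, t):
    """Right-continuous step function: value at the largest T_(i) <= t, else 0."""
    values = np.concatenate(([0.0], fitted))
    return values[np.searchsorted(t_sorted, t, side="right")]

# Toy data: ordered indicators are (0, 1, 0, 1, 1), so the violating pair (1, 0)
# is pooled to 0.5 and the fitted values are (0, 0.5, 0.5, 1, 1).
t_obs = np.array([3.0, 1.0, 5.0, 2.0, 4.0])
delta = np.array([0, 0, 1, 1, 1])
t_sorted, fitted = current_status_mle(t_obs, delta)
```

The jump points of the resulting step function are the ordered times at which `fitted` increases, which is a subset of the observation times, as noted in the text.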

Groeneboom and Wellner (1992) show that this MLE is a consistent estimator
of $F_0$, and prove that under some local smoothness assumptions, for $t > 0$
fixed, $n^{1/3}(\hat F_n(t) - F_0(t))$ has the so-called Chernoff distribution
as limiting distribution. If $F_0$ and $G$ are assumed to satisfy conditions
(F.1) and (G.1) below, Groeneboom and Wellner (1992) also prove (see their
Lemma 5.9 and page 120)

$$
\|F_0 - \hat F_n\|_\infty = O_p(n^{-1/3}\log n), \tag{2.3}
$$

$$
\max_{1 \le j < m} |\tau_{j+1} - \tau_j| = O_p(n^{-1/3}\log n). \tag{2.4}
$$

(F.1) $F_0$ has bounded support $S_0 = [0, M_0]$ and is strictly increasing on
$S_0$ with density $f_0$, strictly staying away from zero.

(G.1) $G$ has support $S_G = [0,\infty)$, is strictly increasing on $S_0$ with
density $g$ staying away from zero and $g'$ is bounded on $S_0$.

From this, it follows that for fixed $t > 0$, any $\nu > 0$ and
$I_t = [t - \nu, t + \nu]$,

$$
\sup_{u \in I_t} |F_0(u) - \hat F_n(u)| = O_p(n^{-1/3}\log n), \tag{2.5}
$$

$$
\max_{j:\,\tau_j \in I_t} |\tau_{j+1} - \tau_j| = O_p(n^{-1/3}\log n). \tag{2.6}
$$

If one is willing to assume smoothness on $F_0$ and use this in the estimation
procedure, this cube-root-$n$ rate of convergence of the estimator can be
improved. The two estimators of $F_0$ we define do indeed converge at the
faster rate $n^{2/5}$.

3. Maximum smoothed likelihood estimation. In this section, we define
the maximum smoothed likelihood estimator (MSLE) $\hat F_n^{MS}$ for the
unknown distribution function $F_0$ of the variable of interest $X$. We
characterize this estimator as the derivative of the convex minorant of a
function on $\mathbb{R}$ and derive its pointwise asymptotic distribution.
Based on $\hat F_n^{MS}$, estimators for the density $f_0$ as well as for the
hazard rate $\lambda_0 = f_0/(1 - F_0)$ are defined and studied asymptotically.

We start with defining the estimators. Define the empirical subdistribution
functions based on the $T_j$'s with $\Delta_j = 0$ and $1$, respectively, by

$$
G_{n,i}(t) = \frac{1}{n}\sum_{j=1}^{n} 1_{[0,t]\times\{i\}}(T_j, \Delta_j) \quad \text{for } i = 0, 1,
$$

and note that the empirical distribution of the data
$\{Z_j = (T_j,\Delta_j) : 1 \le j \le n\}$ can be expressed as
$d\mathbb{P}_n(t,\delta) = \delta\,dG_{n,1}(t) + (1-\delta)\,dG_{n,0}(t)$. Let
$\hat G_{n,1}$ and $\hat G_{n,0}$ be smoothed versions of $G_{n,1}$ and
$G_{n,0}$, respectively (e.g., via kernel smoothing), let $\hat g_{n,1}$ and
$\hat g_{n,0}$ be their densities w.r.t. Lebesgue measure on $[0,\infty)$ and
define $d\hat{\mathbb{P}}_n(t,\delta) = \delta\,d\hat G_{n,1}(t) + (1-\delta)\,d\hat G_{n,0}(t)$.
This is a smoothed version of the empirical measure $\mathbb{P}_n$, where
smoothing is only performed "in the $t$-direction." Following the general
approach of Eggermont and LaRiccia (2001), we replace the empirical
distribution $\mathbb{P}_n$ in the definition of the log likelihood (2.2) by
this smoothed version $\hat{\mathbb{P}}_n$, and define the smoothed log
likelihood on the class of all distribution functions by

$$
l^S(F) = \int \{\delta \log F(t) + (1-\delta)\log(1 - F(t))\}\,d\hat{\mathbb{P}}_n(t,\delta)
= \int \log(1 - F(t))\,d\hat G_{n,0}(t) + \int \log F(t)\,d\hat G_{n,1}(t). \tag{3.1}
$$

The maximizer of the smoothed log likelihood is characterized similarly
to the maximizer of the log likelihood. The next theorem makes this precise.



Theorem 3.1. Define $\hat G_n(t) = \hat G_{n,0}(t) + \hat G_{n,1}(t)$ for
$t \ge 0$ and consider the following parameterized curve in $\mathbb{R}_+^2$,
a continuous cumulative sum diagram (CCSD):

$$
t \mapsto (\hat G_n(t), \hat G_{n,1}(t)), \tag{3.2}
$$

for $t \in [0,\tau]$, with $\tau = \sup\{t \ge 0 : \hat g_{n,0}(t) + \hat g_{n,1}(t) > 0\}$.
Let $\hat F_n^{MS}(t)$ be the right-continuous slope of the lower convex hull
of the CCSD (3.2), evaluated at the point with $x$-coordinate $\hat G_n(t)$.
Then $\hat F_n^{MS}$ is the unique maximizer of (3.1) over the class of all
sub-distribution functions. We call $\hat F_n^{MS}$ the maximum smoothed
likelihood estimator of $F_0$.

In the proof of Theorem 3.1, we use the following lemma, a proof of which
can be found in the Appendix.

Lemma 3.2. Let $\hat F_n^{MS}$ be defined as in Theorem 3.1. Then for any
distribution function $F$,

$$
\int \log F(t)\,d\hat G_{n,1}(t) \le \int \hat F_n^{MS}(t)\log F(t)\,d\hat G_n(t)
$$

and

$$
\int \log(1 - F(t))\,d\hat G_{n,0}(t) \le \int (1 - \hat F_n^{MS}(t))\log(1 - F(t))\,d\hat G_n(t)
$$

with equality in case $F = \hat F_n^{MS}$.

Proof of Theorem 3.1. Use the equality part of Lemma 3.2 to rewrite
(3.1) as

$$
l^S(\hat F_n^{MS}) = \int \bigl(\hat F_n^{MS}(t)\log \hat F_n^{MS}(t)
+ (1 - \hat F_n^{MS}(t))\log(1 - \hat F_n^{MS}(t))\bigr)\,d\hat G_n(t).
$$

By the inequality part of Lemma 3.2, we get for each distribution function
$F$ that

$$
l^S(F) \le \int \hat F_n^{MS}(t)\log F(t)\,d\hat G_n(t)
+ \int (1 - \hat F_n^{MS}(t))\log(1 - F(t))\,d\hat G_n(t).
$$

Now note, using the convention $0 \cdot \infty = 0$, that for all
$p, p' \in [0,1]$

$$
p\log p' + (1-p)\log(1-p') \le p\log p + (1-p)\log(1-p). \tag{3.3}
$$

This implies that $l^S(F) \le l^S(\hat F_n^{MS})$, i.e., $l^S$ is maximal for
$\hat F_n^{MS}$.

For uniqueness, note that inequality (3.3) is strict whenever $p' \ne p$. The
last step in the preceding argument then shows that
$l^S(F) < l^S(\hat F_n^{MS})$, unless $F = \hat F_n^{MS}$ a.e. w.r.t. the
measure $d\hat G_n$. It could be that $d\hat G_n$ has no mass on $[a,b]$ for
some $a < b$, i.e., $(\hat G_n(t), \hat G_{n,1}(t)) = (\hat G_n(a), \hat G_{n,1}(a))$
for all $t \in [a,b]$. This means that $\hat F_n^{MS}$ is constant on $[a,b]$.
Furthermore, it holds that $F(a) = \hat F_n^{MS}(a)$ and
$F(b) = \hat F_n^{MS}(b)$, implying that $F$ is also constant and equal to
$\hat F_n^{MS}$ on $[a,b]$ a.e. w.r.t. the Lebesgue measure on $[0,\infty)$.
Hence, $l^S(F) < l^S(\hat F_n^{MS})$ unless $F = \hat F_n^{MS}$. □

We assume the estimators $\hat G_{n,i}$ are continuously differentiable; hence,
$\hat F_n^{MS}$ is continuous and its derivative exists. So we can define the
maximum smoothed likelihood estimators for $f_0$ and $\lambda_0$ by

$$
\hat f_n^{MS}(t) = \frac{d}{du}\hat F_n^{MS}(u)\Big|_{u=t}
\quad\text{and}\quad
\hat\lambda_n^{MS}(t) = \frac{\hat f_n^{MS}(t)}{1 - \hat F_n^{MS}(t)} \tag{3.4}
$$

for $t \ge 0$ such that $\hat F_n^{MS}(t) < 1$.

In Theorem 3.1 no particular choice for $\hat G_{n,0}$ and $\hat G_{n,1}$ was
made. For what follows, we define these estimators explicitly as kernel
smoothed versions of $G_{n,0}$ and $G_{n,1}$. Let $k$ be a probability density
satisfying condition (K.1).

(K.1) The probability density $k$ has support $[-1,1]$, is symmetric and twice
continuously differentiable on $\mathbb{R}$.

Note that condition (K.1) implies that $m_2(k) = \int u^2 k(u)\,du < \infty$.

Let $K$ be the distribution function with density $k$, i.e.,
$K(t) = \int_{-\infty}^{t} k(u)\,du$, $k'$ be the derivative of $k$ and $h > 0$
be a smoothing parameter (depending on $n$). Then we use the following
notation for the scaled versions of $K$, $k$ and $k'$:

$$
K_h(u) = K(u/h), \qquad k_h(u) = \frac{1}{h}k(u/h) \qquad\text{and}\qquad
k_h'(u) = \frac{1}{h^2}k'(u/h). \tag{3.5}
$$
For $i = 0, 1$ let

$$
\hat g_{n,i}(t) = \int k_h(t - u)\,dG_{n,i}(u)
$$

be kernel (sub-density) estimates based on the observations $T_j$ for which
$\Delta_j = i$, and let $\hat g_n(t) = \hat g_{n,1}(t) + \hat g_{n,0}(t)$. Also
define the associated (sub-)distribution functions

$$
\hat G_{n,i}(t) = \int_{[0,t]} \hat g_{n,i}(u)\,du \quad\text{for } i = 0,1,
\qquad\text{and}\qquad \hat G_n(t) = \int_{[0,t]} \hat g_n(u)\,du.
$$

Because $X \ge 0$, we can expect inconsistency problems for the kernel
density and density derivative estimators at zero. In order to prevent those,
we modify the definition of $\hat g_{n,i}$ for $t < h$. To be precise, we define

$$
\hat g_{n,i}(t) = \int \frac{1}{h}k^{\beta}\Bigl(\frac{t-u}{h}\Bigr)\,dG_{n,i}(u), \qquad 0 \le t < h,
$$

for $\beta = t/h$, where the so-called boundary kernel $k^{\beta}$ is defined by

$$
k^{\beta}(u) = \frac{\nu_{2,\beta}(k) - \nu_{1,\beta}(k)u}
{\nu_{0,\beta}(k)\nu_{2,\beta}(k) - \nu_{1,\beta}(k)^2}\,k(u)1_{(-1,\beta]}(u)
\quad\text{with}\quad \nu_{i,\beta}(k) = \int_{-1}^{\beta} u^i k(u)\,du,\ i = 0,1,2.
$$

Let the estimators $\hat g_{n,i}'$ be the derivatives of $\hat g_{n,i}$, for
$i = 0, 1$. There are other ways to correct the kernel estimator near the
boundary; see, e.g., Schuster (1985) or Jones (1993). However, simulations
show that the results are not much influenced by the boundary correction
method used.

Having made these choices for the smoothed empirical distribution
$\hat{\mathbb{P}}_n$, let us return to the MSLE. It is the maximizer of $l^S$
over the class of all distribution functions. One could also maximize $l^S$
over the bigger class of all functions, maximizing the integrand of (3.1) for
each $t$ separately. This results in

$$
\hat F_n^{naive}(t) = \frac{\hat g_{n,1}(t)}{\hat g_n(t)}, \qquad
\hat f_n^{naive}(t) = \frac{\hat g_n(t)\hat g_{n,1}'(t) - \hat g_n'(t)\hat g_{n,1}(t)}{\hat g_n(t)^2}, \tag{3.6}
$$

where

$$
\hat g_n(t) = \hat g_{n,0}(t) + \hat g_{n,1}(t). \tag{3.7}
$$

We call these naive estimators, since $\hat f_n^{naive}$ might take negative
values, meaning that $\hat F_n^{naive}$ decreases locally.

Fig. 1. A part of the CCSD, its lower convex hull and the estimates
$\hat F_n^{naive}$ and $\hat F_n^{MS}$ for $F_0$ based on simulated data, with
$n = 500$. (a) Part of the CCSD (grey line) and its lower convex hull (dashed
line); (b) estimates $\hat F_n^{naive}$ (grey line) and $\hat F_n^{MS}$ (dashed
line) of $F_0$ (dotted line).

Figure 1(a) shows a part of the CCSD defined in (3.2) and its lower convex
hull. Figure 1(b) shows the naive estimator $\hat F_n^{naive}$ (the grey line),
the MSLE $\hat F_n^{MS}$ and the true distribution $F_0$ for a simulation of
size 500. The unknown distribution of the variable $X$ is taken to be a shifted
Gamma(4) distribution, i.e.,
$f_0(x) = \frac{(x-2)^3}{3!}\exp(-(x-2))1_{[2,\infty)}(x)$, and the censoring
variable $T$ has an exponential distribution with mean 3, i.e.,
$g(t) = \frac{1}{3}\exp(-t/3)1_{[0,\infty)}(t)$. For the kernel density, we
took the triweight kernel $k(t) = \frac{35}{32}(1 - t^2)^3 1_{[-1,1]}(t)$ and as
bandwidth $h = 0.7$. This picture shows that the estimator $\hat F_n^{MS}$ is
the isotonic version of the estimator $\hat F_n^{naive}$.
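The relation between the naive estimator and the MSLE in this simulation can be sketched as follows. This is a rough grid-based illustration under stated assumptions, not the authors' code: the boundary correction near zero is omitted, the random seed and grid are arbitrary, and the slope of the lower convex hull of the CCSD is approximated by a weighted isotonic regression of the naive values with weights proportional to the kernel estimate of $g$ on a uniform grid. The helper names `triweight`, `weighted_pava` and `naive_and_msle` are hypothetical.

```python
import numpy as np

def triweight(u):
    """Triweight kernel k(u) = 35/32 (1 - u^2)^3 on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 35.0 / 32.0 * (1.0 - u**2) ** 3, 0.0)

def weighted_pava(y, w):
    """Weighted non-decreasing least-squares fit (pool-adjacent-violators)."""
    means, weights, counts = [], [], []
    for value, weight in zip(y, w):
        means.append(float(value))
        weights.append(float(weight))
        counts.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            total = weights[-2] + weights[-1]
            pooled = (weights[-2] * means[-2] + weights[-1] * means[-1]) / total
            means[-2:] = [pooled]
            weights[-2:] = [total]
            counts[-2:] = [counts[-2] + counts[-1]]
    return np.repeat(means, counts)

def naive_and_msle(t_obs, delta, grid, h):
    """Naive ratio estimator and its isotonized version on a uniform grid."""
    n = len(t_obs)
    kmat = triweight((grid[:, None] - t_obs[None, :]) / h) / h
    g1 = kmat @ delta / n            # kernel estimate of g_1 = F_0 * g
    g = kmat.sum(axis=1) / n         # kernel estimate of g
    ok = g > 0                       # the ratio is only defined where g-hat > 0
    naive = np.full(grid.shape, np.nan)
    naive[ok] = g1[ok] / g[ok]
    msle = np.full(grid.shape, np.nan)
    msle[ok] = weighted_pava(naive[ok], g[ok])
    return naive, msle

# Simulation set-up described in the text: shifted Gamma(4) event times,
# exponential censoring times with mean 3, n = 500, h = 0.7.
rng = np.random.default_rng(0)
n = 500
x = 2.0 + rng.gamma(shape=4.0, scale=1.0, size=n)
t_obs = rng.exponential(scale=3.0, size=n)
delta = (x <= t_obs).astype(float)
grid = np.linspace(0.0, 12.0, 241)
naive, msle = naive_and_msle(t_obs, delta, grid, h=0.7)
```

On stretches of the grid where the naive estimate is already non-decreasing, the two arrays agree, which is the behavior visible in Figure 1(b).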

The next theorem shows that for appropriately chosen $h$, the naive
estimator $\hat F_n^{naive}$ will be monotonically increasing on big intervals
with probability converging to one as $n$ tends to infinity if $F_0$ and $G$
satisfy conditions (F.1) and (G.1).

Theorem 3.3. Assume $F_0$ and $G$ satisfy conditions (F.1) and (G.1).
Let $\hat g_n$ and $\hat g_{n,1}$ be kernel estimators for $g$ and $g_1$ with
kernel density $k$ satisfying condition (K.1). Let $h = cn^{-\alpha}$ ($c > 0$)
be the bandwidth used in the definition of $\hat g_n$ and $\hat g_{n,1}$. Then
for all $0 < m < M < M_0$ and $\alpha \in (0, 1/3)$ the following holds:

$$
P(\hat F_n^{naive} \text{ is monotonically increasing on } [m,M]) \to 1. \tag{3.8}
$$

Note that this theorem as it stands does not imply that
$\hat F_n^{MS}(t) = \hat F_n^{naive}(t)$ on $[m,M]$ with probability tending to
one. Some additional control on the behavior of $\hat F_n^{naive}$ on $[0,m)$
and $(M, M_0]$ is needed. The proof of the corollary below makes this precise.

Corollary 3.4. Under the assumptions of Theorem 3.3, it holds that
for all $0 < m < M < M_0$ and $\alpha \in (0, 1/3)$,

$$
P(\hat F_n^{naive}(t) = \hat F_n^{MS}(t) \text{ for all } t \in [m,M]) \to 1. \tag{3.9}
$$

Consequently, for all $t > 0$ the asymptotic distributions of
$\hat F_n^{MS}(t)$ and $\hat F_n^{naive}(t)$ are the same.

In van der Vaart and van der Laan (2003), a result similar to our
Corollary 3.4 is proved for smooth monotone density estimators. The kernel
estimator is compared with an isotonized version of this estimator. Their
proof is based on a so-called switch-relation relating the derivative of the
convex minorant of a function to that of an argmax function. The direct
argument we use to prove Corollary 3.4 furnishes an alternative way to prove
their result.

By Corollary 3.4, the estimators $\hat F_n^{MS}(t)$ and $\hat F_n^{naive}(t)$
have the same asymptotic distribution. The same holds for $\hat f_n^{MS}(t)$
and $\hat f_n^{naive}(t)$ as well as for $\hat\lambda_n^{MS}(t)$ and
$\hat\lambda_n^{naive}(t)$. The pointwise asymptotic distribution of
$\hat F_n^{naive}(t)$ follows easily from the Lindeberg-Feller central limit
theorem and the delta method. The resulting pointwise asymptotic normality of
both $\hat F_n^{MS}(t)$ and $\hat F_n^{naive}(t)$ is stated in the next theorem.

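Theorem 3.5. Assume $F_0$ and $G$ satisfy conditions (F.1) and (G.1).
Fix $t > 0$ such that $f_0'$ and $g''$ exist and are continuous at $t$ and
$g(t)f_0'(t) + 2f_0(t)g'(t) \ne 0$. Let $h = cn^{-1/5}$ ($c > 0$) be the
bandwidth used in the definition of $\hat g_n$ and $\hat g_{n,1}$. Then

$$
n^{2/5}(\hat F_n^{MS}(t) - F_0(t)) \rightsquigarrow \mathcal{N}(\mu_{F,MS}, \sigma^2_{F,MS}),
$$

where

$$
\mu_{F,MS} = \tfrac{1}{2}c^2 m_2(k)\Bigl\{f_0'(t) + 2\frac{f_0(t)g'(t)}{g(t)}\Bigr\},
\qquad
\sigma^2_{F,MS} = \frac{F_0(t)(1 - F_0(t))}{c\,g(t)}\int k(u)^2\,du.
$$

This also holds if we replace $\hat F_n^{MS}$ by $\hat F_n^{naive}$.

For fixed $t > 0$, the asymptotically MSE-optimal bandwidth $h$ for
$\hat F_n^{MS}(t)$ is given by $h_{n,F,MS} = c_{F,MS}\,n^{-1/5}$, where

$$
c_{F,MS} = \Bigl\{\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k(u)^2\,du\Bigr\}^{1/5}
\Bigl\{m_2^2(k)\Bigl[f_0'(t) + 2\frac{f_0(t)g'(t)}{g(t)}\Bigr]^2\Bigr\}^{-1/5}. \tag{3.10}
$$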

Proof. For fixed $c > 0$, the asymptotic distribution of $\hat F_n^{naive}$
follows immediately by applying the delta method with $\varphi(u,v) = v/u$ to
the first result in Lemma A.3. By Corollary 3.4, this also gives the asymptotic
distribution of $\hat F_n^{MS}$.

To obtain the bandwidth which minimizes the asymptotic mean squared
error (aMSE) we minimize

$$
\mathrm{aMSE}(\hat F_n^{MS}, c) = \tfrac{1}{4}c^4 m_2^2(k)\Bigl\{f_0'(t) + 2\frac{f_0(t)g'(t)}{g(t)}\Bigr\}^2
+ c^{-1}\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k(u)^2\,du
$$

with respect to $c$. This yields (3.10). □
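The minimization step in this proof can be written out as a short calculation. The shorthand $b(t)$ for the bias factor and $v(t)$ for the variance factor is introduced here only for this sketch and is not notation from the paper; the aMSE is the squared asymptotic bias plus the asymptotic variance from Theorem 3.5.

```latex
\begin{align*}
b(t) &= f_0'(t) + 2\,\frac{f_0(t)\,g'(t)}{g(t)}, \qquad
v(t) = \frac{F_0(t)\,(1-F_0(t))}{g(t)} \int k(u)^2\,du, \\
\mathrm{aMSE}(c) &= \tfrac14\, c^4\, m_2^2(k)\, b(t)^2 + c^{-1}\, v(t), \\
\frac{d}{dc}\,\mathrm{aMSE}(c) &= c^3\, m_2^2(k)\, b(t)^2 - c^{-2}\, v(t) = 0
\quad\Longleftrightarrow\quad
c^5 = \frac{v(t)}{m_2^2(k)\, b(t)^2}, \\
\frac{d^2}{dc^2}\,\mathrm{aMSE}(c) &= 3\,c^2\, m_2^2(k)\, b(t)^2 + 2\,c^{-3}\, v(t) > 0
\quad \text{for } c > 0 .
\end{align*}
```

The stationary point is therefore the unique minimizer over $c > 0$, and taking the fifth root gives the product form of the constant in (3.10). The same argument with $c^{-3}$ in place of $c^{-1}$ explains the factor 3 and the exponent $1/7$ in the constants for the density and hazard estimators.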

Remark 3.1. In case $g(t)f_0'(t) + 2f_0(t)g'(t) = 0$, the optimal rate of
$h_{n,F,MS}$ is $n^{-1/9}$, resulting in a rate of convergence $n^{-4/9}$ for
$\hat F_n^{MS}$. This is in line with results for other kernel smoothers in
case of vanishing first-order bias terms.

The pointwise asymptotic distributions of $\hat f_n^{MS}(t)$ and
$\hat f_n^{naive}(t)$ also follow from the Lindeberg-Feller central limit
theorem and the delta method.

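Theorem 3.6. Consider $\hat f_n^{MS}$ as defined in (3.4) and assume $F_0$ and
$G$ satisfy conditions (F.1) and (G.1). Fix $t > 0$ such that $f_0''$ and
$g'''$ exist and are continuous at $t$. Let $h = cn^{-1/7}$ ($c > 0$) be the
bandwidth used to define $\hat F_n^{MS}$. Then

$$
n^{2/7}(\hat f_n^{MS}(t) - f_0(t)) \rightsquigarrow \mathcal{N}(\mu_{f,MS}, \sigma^2_{f,MS}),
$$

where

$$
\mu_{f,MS} = \tfrac{1}{2}c^2 m_2(k)\Bigl\{f_0''(t) + 2\frac{g''(t)f_0(t) + g'(t)f_0'(t)}{g(t)}
- 2\frac{g'(t)^2 f_0(t)}{g(t)^2}\Bigr\} =: \tfrac{1}{2}c^2 m_2(k)q(t),
$$

$$
\sigma^2_{f,MS} = \frac{F_0(t)(1 - F_0(t))}{c^3 g(t)}\int k'(u)^2\,du
$$

for $t$ such that $q(t) \ne 0$. This also holds if we replace $\hat f_n^{MS}$
by $\hat f_n^{naive}$.

For fixed $t > 0$, the aMSE-optimal bandwidth $h$ for $\hat f_n^{MS}(t)$ is
given by $h_{n,f,MS} = c_{f,MS}\,n^{-1/7}$, where

$$
c_{f,MS} = \Bigl\{3\,\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k'(u)^2\,du\Bigr\}^{1/7}
\{m_2^2(k)q^2(t)\}^{-1/7}. \tag{3.11}
$$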

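Proof. Write $\hat g_n(t) = g(t) + R_n(t)$ and
$\hat g_{n,1}(t) = g_1(t) + R_{n,1}(t)$, so

$$
n^{2/7}(\hat f_n^{naive}(t) - f_0(t))
= n^{2/7}\Bigl(\frac{g(t)\hat g_{n,1}'(t) - \hat g_n'(t)g_1(t)}{g(t)^2}
- \frac{g(t)g_1'(t) - g'(t)g_1(t)}{g(t)^2}\Bigr) + T_n(t)
$$

for

$$
\begin{aligned}
T_n(t) &= n^{2/7}\frac{[g(t) + R_n(t)]\hat g_{n,1}'(t) - \hat g_n'(t)[g_1(t) + R_{n,1}(t)]}{[g(t) + R_n(t)]^2}
- n^{2/7}\frac{g(t)\hat g_{n,1}'(t) - \hat g_n'(t)g_1(t)}{g(t)^2} \\
&= n^{2/7}\frac{R_n(t)\hat g_{n,1}'(t) - \hat g_n'(t)R_{n,1}(t)}{[g(t) + R_n(t)]^2}
- n^{2/7}\bigl(g(t)\hat g_{n,1}'(t) - \hat g_n'(t)g_1(t)\bigr)
\frac{2g(t)R_n(t) + R_n(t)^2}{g(t)^2[g(t) + R_n(t)]^2}.
\end{aligned}
$$

Applying the delta method with $\varphi(u,v) = (g(t)v - g_1(t)u)/g(t)^2$ to the
last result in Lemma A.3 gives that

$$
n^{2/7}\Bigl(\frac{g(t)\hat g_{n,1}'(t) - \hat g_n'(t)g_1(t)}{g(t)^2}
- \frac{g(t)g_1'(t) - g'(t)g_1(t)}{g(t)^2}\Bigr)
\rightsquigarrow \mathcal{N}(\mu_1, \sigma^2_{f,MS})
$$

for

$$
\mu_1 = \tfrac{1}{2}c^2 m_2(k)\Bigl(f_0''(t) + 3\,\frac{g''(t)f_0(t) + g'(t)f_0'(t)}{g(t)}\Bigr). \tag{3.12}
$$

By Lemma A.3, $n^{2/7}R_n(t) \to_P \tfrac{1}{2}c^2 m_2(k)g''(t)$ and
$n^{2/7}R_{n,1}(t) \to_P \tfrac{1}{2}c^2 m_2(k)g_1''(t)$, so by the consistency
of $\hat g_n'$ and $\hat g_{n,1}'$, see Lemma A.2, and the continuous mapping
theorem we have

$$
T_n(t) \to_P \mu_2 = -\tfrac{1}{2}c^2 m_2(k)\Bigl(\frac{g''(t)f_0(t) + g'(t)f_0'(t)}{g(t)}
+ 2\,\frac{g'(t)^2 f_0(t)}{g(t)^2}\Bigr).
$$

Hence, we have that

$$
n^{2/7}(\hat f_n^{naive}(t) - f_0(t)) \rightsquigarrow \mathcal{N}(\mu_{f,MS}, \sigma^2_{f,MS})
$$

for $\mu_{f,MS} = \mu_1 + \mu_2$. By Corollary 3.4, this also gives the
asymptotic distribution of $\hat f_n^{MS}$.

The optimal $c$ given in (3.11) is obtained by minimizing

$$
\mathrm{aMSE}(\hat f_n^{MS}, c) = \tfrac{1}{4}c^4 m_2^2(k)q^2(t)
+ c^{-3}\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k'(u)^2\,du.
$$

□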



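Corollary 3.7. Consider $\hat\lambda_n^{MS}$ as defined in (3.4) and let
$h = cn^{-1/7}$ ($c > 0$) be the bandwidth used to compute it. Assume $F_0$ and
$G$ satisfy conditions (F.1) and (G.1). Fix $t > 0$ such that $F_0(t) < 1$ and
$f_0''$ and $g'''$ exist and are continuous at $t$. Then

$$
n^{2/7}(\hat\lambda_n^{MS}(t) - \lambda_0(t)) \rightsquigarrow \mathcal{N}(\mu_{\lambda,MS}, \sigma^2_{\lambda,MS}),
$$

where

$$
\mu_{\lambda,MS} = \frac{\tfrac{1}{2}c^2 m_2(k)}{1 - F_0(t)}\Bigl(f_0''(t)
+ 2\,\frac{g''(t)f_0(t) + g'(t)f_0'(t)}{g(t)} - 2\,\frac{g'(t)^2 f_0(t)}{g(t)^2}
+ \frac{f_0(t)}{1 - F_0(t)}\Bigl[f_0'(t) + 2\frac{f_0(t)g'(t)}{g(t)}\Bigr]\Bigr)
=: \tfrac{1}{2}c^2 m_2(k)r(t),
$$

$$
\sigma^2_{\lambda,MS} = \frac{F_0(t)}{c^3 g(t)(1 - F_0(t))}\int k'(u)^2\,du
$$

for $t$ such that $r(t) \ne 0$. This also holds if we replace
$\hat\lambda_n^{MS}$ by $\hat\lambda_n^{naive}$.

For fixed $t > 0$ the aMSE-optimal bandwidth $h$ for $\hat\lambda_n^{MS}(t)$ is
given by $h_{n,\lambda,MS} = c_{\lambda,MS}\,n^{-1/7}$, where

$$
c_{\lambda,MS} = \Bigl\{3\,\frac{F_0(t)}{g(t)(1 - F_0(t))}\int k'(u)^2\,du\Bigr\}^{1/7}
\{m_2^2(k)r^2(t)\}^{-1/7}. \tag{3.13}
$$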
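
Proof. Write $\hat F_n^{MS}(t) = F_0(t) + R_n(t)$, then

$$
n^{2/7}(\hat\lambda_n^{MS}(t) - \lambda_0(t))
= \frac{n^{2/7}}{1 - F_0(t)}(\hat f_n^{MS}(t) - f_0(t)) + T_n(t) \tag{3.14}
$$

with

$$
T_n(t) = n^{2/7}\hat f_n^{MS}(t)\Bigl(\frac{1}{1 - F_0(t) - R_n(t)} - \frac{1}{1 - F_0(t)}\Bigr).
$$

If $h = cn^{-1/7}$ is the bandwidth for $\hat F_n^{MS}(t)$, then

$$
n^{2/7}(\hat F_n^{MS}(t) - F_0(t)) \to_P \mu_{F,MS}
$$

by Lemma A.3 and the delta method. This implies that
$n^{2/7}R_n(t) \to_P \mu_{F,MS}$ and

$$
T_n(t) = \frac{\hat f_n^{MS}(t)\,n^{2/7}R_n(t)}{(1 - F_0(t))(1 - F_0(t) - R_n(t))}
\to_P \mu_{F,MS}\frac{f_0(t)}{(1 - F_0(t))^2}.
$$

Since we also have that

$$
\frac{n^{2/7}}{1 - F_0(t)}(\hat f_n^{MS}(t) - f_0(t)) \rightsquigarrow
\mathcal{N}\Bigl(\frac{\mu_{f,MS}}{1 - F_0(t)}, \frac{\sigma^2_{f,MS}}{(1 - F_0(t))^2}\Bigr),
$$

we get that $\mu_{\lambda,MS} = \mu_{f,MS}/(1 - F_0(t)) + \mu_{F,MS}f_0(t)/(1 - F_0(t))^2$.
The optimal $c$ given in (3.13) is obtained by minimizing

$$
\mathrm{aMSE}(\hat\lambda_n^{MS}, c) = \tfrac{1}{4}c^4 m_2^2(k)r^2(t)
+ c^{-3}\frac{F_0(t)}{g(t)(1 - F_0(t))}\int k'(u)^2\,du. \qquad □
$$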

4. Smoothed maximum likelihood estimation. In the previous section,
we started smoothing the empirical distribution of the observed data, and
used that probability measure instead of the empirical distribution function
in the definition of the log likelihood. In this section, we consider an
estimator that is obtained by smoothing the MLE (see Section 2). Recall the
definitions of the scaled versions of $K$, $k$ and $k'$, given in (3.5):

$$
K_h(u) = K(u/h), \qquad k_h(u) = \frac{1}{h}k(u/h) \qquad\text{and}\qquad
k_h'(u) = \frac{1}{h^2}k'(u/h).
$$

Define the SMLE $\hat F_n^{SM}$ for $F_0$ by

$$
\hat F_n^{SM}(t) = \int K_h(t - u)\,d\hat F_n(u).
$$

Similarly, define the SMLE $\hat f_n^{SM}$ for $f_0$ and the SMLE
$\hat\lambda_n^{SM}$ of $\lambda_0$ by

$$
\hat f_n^{SM}(t) = \int k_h(t - u)\,d\hat F_n(u) \qquad\text{and}\qquad
\hat\lambda_n^{SM}(t) = \frac{\hat f_n^{SM}(t)}{1 - \hat F_n^{SM}(t)}.
$$
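Since the MLE is a step function, the integrals defining the SMLE reduce to finite sums over its jumps, which the following sketch makes explicit. It is an illustration under stated assumptions rather than the authors' code: the triweight kernel is used, no boundary correction is applied near zero, the MLE is computed by pool-adjacent-violators, and the simulated data, seed and bandwidth are arbitrary. The names `pava`, `K_triweight`, `k_triweight` and `smle` are hypothetical.

```python
import numpy as np

def pava(y):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators)."""
    means, counts = [], []
    for value in np.asarray(y, dtype=float):
        means.append(value)
        counts.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            pooled = (counts[-2] * means[-2] + counts[-1] * means[-1]) / total
            means[-2:] = [pooled]
            counts[-2:] = [total]
    return np.repeat(means, counts)

def k_triweight(u):
    """Triweight density k(u) = 35/32 (1 - u^2)^3 on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 35.0 / 32.0 * (1.0 - u**2) ** 3, 0.0)

def K_triweight(u):
    """Distribution function of the triweight kernel (0 below -1, 1 above 1)."""
    u = np.clip(u, -1.0, 1.0)
    return 0.5 + 35.0 / 32.0 * (u - u**3 + 0.6 * u**5 - u**7 / 7.0)

def smle(t_obs, delta, grid, h):
    """Smoothed MLE of F_0, f_0 and the hazard on a grid, for bandwidth h."""
    order = np.argsort(t_obs, kind="stable")
    tau = np.asarray(t_obs, dtype=float)[order]
    fitted = pava(np.asarray(delta, dtype=float)[order])
    jumps = np.diff(np.concatenate(([0.0], fitted)))  # mass of the MLE at each T_(i)
    scaled = (grid[:, None] - tau[None, :]) / h
    F_sm = K_triweight(scaled) @ jumps                # sum_j p_j K_h(t - tau_j)
    f_sm = (k_triweight(scaled) / h) @ jumps          # sum_j p_j k_h(t - tau_j)
    with np.errstate(divide="ignore", invalid="ignore"):
        lam_sm = np.where(F_sm < 1.0, f_sm / (1.0 - F_sm), np.nan)
    return F_sm, f_sm, lam_sm

# Illustrative data: shifted Gamma(4) event times, exponential(mean 3) censoring.
rng = np.random.default_rng(1)
n = 400
x = 2.0 + rng.gamma(shape=4.0, scale=1.0, size=n)
t_obs = rng.exponential(scale=3.0, size=n)
delta = (x <= t_obs).astype(float)
grid = np.linspace(0.0, 10.0, 201)
F_sm, f_sm, lam_sm = smle(t_obs, delta, grid, h=1.5)
```

Because the jumps of the MLE are non-negative and the kernel is a density, `F_sm` is automatically non-decreasing and `f_sm` is non-negative, in contrast to the naive estimator of Section 3.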

In this section, we derive the pointwise asymptotic distributions for these
estimators. First, we rewrite the estimators $\hat F_n^{SM}$ and
$\hat f_n^{SM}$.

Lemma 4.1. Fix $t > 0$, such that $g(u) > 0$ in a neighborhood of $t$ and
define

$$
\psi_{h,t}(u) = \frac{k_h(t - u)}{g(u)}, \qquad \varphi_{h,t}(u) = \frac{k_h'(t - u)}{g(u)}. \tag{4.1}
$$

Then

$$
\int K_h(t - u)\,d(\hat F_n - F_0)(u) = -\int \psi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta), \tag{4.2}
$$

$$
\int k_h(t - u)\,d(\hat F_n - F_0)(u) = -\int \varphi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta). \tag{4.3}
$$



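Proof. To see equality (4.2), we rewrite the left-hand side as follows:

$$
\begin{aligned}
\int_0^{t+h} K_h(t - u)\,d(\hat F_n - F_0)(u)
&= \int_0^{t-h} d(\hat F_n - F_0)(u) + \int_{t-h}^{t+h} K_h(t - u)\,d(\hat F_n - F_0)(u) \\
&= \hat F_n(t - h) - F_0(t - h) + K_h(t - u)(\hat F_n(u) - F_0(u))\Big|_{u=t-h}^{t+h} \\
&\qquad + \int_{t-h}^{t+h} (\hat F_n(u) - F_0(u))k_h(t - u)\,du \\
&= \int \frac{k_h(t - u)}{g(u)}(\hat F_n(u) - F_0(u))\,dG(u) \\
&= -\int \psi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta).
\end{aligned}
$$

Equation (4.3) follows by a similar argument. □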

Hence, in determining the asymptotic distribution of the estimators
$\hat F_n^{SM}(t)$ and $\hat f_n^{SM}(t)$, we can consider the integrals at the
right-hand side of (4.2) and (4.3). The idea of the proof of the asymptotic
result for $\hat F_n^{SM}(t)$, given in the next theorem and proven in the
Appendix, is as follows. By the characterization of the MLE, given in
Lemma A.5, we could add the term $d\mathbb{P}_n$ for free in the right-hand
side of (4.2) if $\psi_{h,t}$ were piecewise constant. For most choices of $k$
this function $\psi_{h,t}$ is not piecewise constant. Replacing it by an
appropriately chosen piecewise constant function results in an additional
$O_p$-term which does not influence the asymptotic distribution. By some more
adding and subtracting, resulting in some more $O_p$-terms, we get that

$$
-n^{2/5}\int \psi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta)
= n^{2/5}\int \psi_{h,t}(u)(\delta - F_0(u))\,d(\mathbb{P}_n - P_0)(u,\delta) + o_p(1)
$$

and the pointwise asymptotic distribution follows from the central limit
theorem.

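Theorem 4.2. Assume $F_0$ and $G$ satisfy conditions (F.1) and (G.1).
Fix $t > 0$ such that $f_0'$ is continuous at $t$ and $f_0'(t) \ne 0$. Let
$h = cn^{-\alpha}$ ($c > 0$) be the bandwidth used in the definition of
$\hat F_n^{SM}$. Then for $\alpha = 1/5$

$$
n^{2/5}(\hat F_n^{SM}(t) - F_0(t)) \rightsquigarrow \mathcal{N}(\mu_{F,SM}, \sigma^2_{F,SM}),
$$

where

$$
\mu_{F,SM} = \tfrac{1}{2}c^2 m_2(k)f_0'(t), \qquad
\sigma^2_{F,SM} = \frac{F_0(t)(1 - F_0(t))}{c\,g(t)}\int k(u)^2\,du. \tag{4.4}
$$

For fixed $t > 0$ the aMSE-optimal bandwidth $h$ for $\hat F_n^{SM}(t)$
is given by $h_{n,F,SM} = c_{F,SM}\,n^{-1/5}$, where

$$
c_{F,SM} = \Bigl\{\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k(u)^2\,du\Bigr\}^{1/5}
\{m_2^2(k)f_0'(t)^2\}^{-1/5}. \tag{4.5}
$$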
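As a numerical check of the constant in (4.5), the sketch below evaluates it for the simulation model used later in the paper (shifted Gamma(4) event times, exponential censoring with mean 3, triweight kernel). It is a minimal illustration, not the authors' code; the function names `F0`, `f0_prime`, `g`, `integral_on_support` and `c_F_SM` are hypothetical helpers, and the closed-form derivative of the Gamma(4) density is worked out here for the sketch.

```python
import numpy as np
from math import exp, factorial
from numpy.polynomial import Polynomial

# Triweight kernel on [-1, 1] written as a polynomial: k(u) = 35/32 (1 - u^2)^3.
k_poly = (35.0 / 32.0) * Polynomial([1.0, 0.0, -1.0]) ** 3

def integral_on_support(p):
    """Integrate the polynomial p over the kernel support [-1, 1]."""
    antiderivative = p.integ()
    return antiderivative(1.0) - antiderivative(-1.0)

m2 = integral_on_support(k_poly * Polynomial([0.0, 0.0, 1.0]))  # int u^2 k(u) du
k_sq = integral_on_support(k_poly * k_poly)                      # int k(u)^2 du

def F0(t):
    """Distribution function of 2 + Gamma(4, 1)."""
    x = t - 2.0
    if x <= 0:
        return 0.0
    return 1.0 - exp(-x) * sum(x**j / factorial(j) for j in range(4))

def f0_prime(t):
    """Derivative of the density f_0(x) = (x - 2)^3 exp(-(x - 2)) / 3!."""
    x = t - 2.0
    return (3.0 * x**2 - x**3) * exp(-x) / 6.0 if x > 0 else 0.0

def g(t):
    """Exponential censoring density with mean 3."""
    return exp(-t / 3.0) / 3.0

def c_F_SM(t):
    """Constant of the aMSE-optimal bandwidth in (4.5)."""
    variance_part = F0(t) * (1.0 - F0(t)) / g(t) * k_sq
    bias_part = m2**2 * f0_prime(t) ** 2
    return (variance_part / bias_part) ** 0.2

n = 10_000
c_at_4, c_at_65 = c_F_SM(4.0), c_F_SM(6.5)
h_at_4, h_at_65 = c_at_4 * n ** (-0.2), c_at_65 * n ** (-0.2)
```

With $n = 10{,}000$ these values should agree, up to rounding, with the row of theoretical values reported in Table 1 (about 6.467 and 10.426 for $c$, and 1.025 and 1.652 for the bandwidth).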

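Theorem 4.3. Assume $F_0$ and $G$ satisfy conditions (F.1) and (G.1).
Fix $t > 0$ such that $f_0''$ is continuous at $t$ and $f_0''(t) \ne 0$. Let
$h = cn^{-1/7}$ ($c > 0$) be the bandwidth used in the definition of
$\hat f_n^{SM}$. Then

$$
n^{2/7}(\hat f_n^{SM}(t) - f_0(t)) \rightsquigarrow \mathcal{N}(\mu_{f,SM}, \sigma^2_{f,SM}),
$$

where

$$
\mu_{f,SM} = \tfrac{1}{2}c^2 m_2(k)f_0''(t), \qquad
\sigma^2_{f,SM} = \frac{F_0(t)(1 - F_0(t))}{c^3 g(t)}\int k'(u)^2\,du.
$$

For fixed $t > 0$ the aMSE-optimal value of $h$ for estimating
$\hat f_n^{SM}(t)$ is given by $h_{n,f,SM} = c_{f,SM}\,n^{-1/7}$, where

$$
c_{f,SM} = \Bigl\{3\,\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k'(u)^2\,du\Bigr\}^{1/7}
\{m_2^2(k)f_0''(t)^2\}^{-1/7}. \tag{4.6}
$$

The proof of this result is similar to the proof of Theorem 4.2, hence it is
omitted.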

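Corollary 4.4. Assume $F_0$ and $G$ satisfy conditions (F.1) and (G.1).
Fix $t > 0$ such that $F_0(t) < 1$, $f_0''$ is continuous in $t$ and
$f_0''(t) \ne 0$. Let $h = cn^{-1/7}$ ($c > 0$) be the bandwidth used to compute
$\hat\lambda_n^{SM}$. Then

$$
n^{2/7}(\hat\lambda_n^{SM}(t) - \lambda_0(t)) \rightsquigarrow \mathcal{N}(\mu_{\lambda,SM}, \sigma^2_{\lambda,SM}),
$$

where

$$
\mu_{\lambda,SM} = \frac{\tfrac{1}{2}c^2 m_2(k)}{1 - F_0(t)}\Bigl(f_0''(t) + \frac{f_0(t)f_0'(t)}{1 - F_0(t)}\Bigr),
\qquad
\sigma^2_{\lambda,SM} = \frac{F_0(t)}{c^3 g(t)(1 - F_0(t))}\int k'(u)^2\,du
$$

for $t$ such that $(1 - F_0(t))f_0''(t) + f_0(t)f_0'(t) \ne 0$.

For fixed $t > 0$ the aMSE-optimal bandwidth $h$ for $\hat\lambda_n^{SM}(t)$ is
given by $h_{n,\lambda,SM} = c_{\lambda,SM}\,n^{-1/7}$, where

$$
c_{\lambda,SM} = \Bigl\{3\,\frac{F_0(t)}{g(t)(1 - F_0(t))}\int k'(u)^2\,du\Bigr\}^{1/7}
\Bigl\{\frac{m_2^2(k)}{(1 - F_0(t))^2}\Bigl(f_0''(t) + \frac{f_0(t)f_0'(t)}{1 - F_0(t)}\Bigr)^2\Bigr\}^{-1/7}. \tag{4.7}
$$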



Proof. The proof uses the same decomposition as the proof of Corollary
3.7, but now $n^{2/7}R_n(t) \to_P \tfrac{1}{2}c^2 m_2(k)f_0'(t)$. This gives that

$$
T_n(t) \to_P \tfrac{1}{2}c^2 m_2(k)f_0'(t)\frac{f_0(t)}{(1 - F_0(t))^2}
= \mu_{F,SM}\frac{f_0(t)}{(1 - F_0(t))^2}
$$

and $\mu_{\lambda,SM} = \mu_{f,SM}/(1 - F_0(t)) + \mu_{F,SM}f_0(t)/(1 - F_0(t))^2$. □

5. Bandwidth selection in practice. In the previous sections, we derived
the optimal bandwidths to estimate $\theta_0(F)$ [the unknown distribution
function $F_0$, its density $f_0$ or the hazard rate
$\lambda_0 = f_0/(1 - F_0)$ at a point $t$] using two different smoothing
methods. These optimal bandwidths can be written as
$h_{n,\hat\theta(F)} = c_{\hat\theta(F)}\,n^{-\alpha}$ for some $\alpha > 0$
(either 1/5 or 1/7), where $c_{\hat\theta(F)}$ is defined as the minimizer of
aMSE($c$) over all positive $c$. For example, $\theta_0(F) = F_0(t)$ and
$\hat\theta(F) = \hat F_n^{SM}(t)$. However, the asymptotic mean squared
error depends on the unknown distribution $F_0$, so $c_{\hat\theta(F)}$ and
$h_{n,\hat\theta(F)}$ are unknown.

Several data dependent methods are known to overcome this problem by
estimating the aMSE, e.g., the bootstrap method of Efron (1979) or plug-in
methods where the unknown quantities, like $f_0$ or $f_0'$, in the aMSE are
replaced by estimates [see, e.g., Sheather (1983)]. We use the
smoothed bootstrap method, which is commonly used to estimate
the bandwidth in density-type problems; see, e.g., Hazelton (1996) and
Gonzalez-Manteiga, Cao and Marron (1996).

For $\hat\theta(F) = \hat F_n^{SM}(t)$ the smoothed bootstrap works as follows.
Let $n$ be the sample size and $h_0 = c_0 n^{-1/5}$ an initial choice of the
bandwidth. Instead of sampling from the empirical distribution (as is done in
the usual bootstrap) we sample $X_1^{*,1}, X_2^{*,1}, \ldots, X_m^{*,1}$
($m \le n$) from the distribution $\hat F_{n,h_0}^{SM}$ (where we explicitly
denote the bandwidth $h_0$ used to compute $\hat F_n^{SM}$). Furthermore, we
sample $T_1^{*,1}, \ldots, T_m^{*,1}$ from $\hat G_{n,h_0}$ and define
$\Delta_i^{*,1} = 1_{\{X_i^{*,1} \le T_i^{*,1}\}}$. Based on the sample
$(T_1^{*,1}, \Delta_1^{*,1}), \ldots, (T_m^{*,1}, \Delta_m^{*,1})$, we
determine the estimator $\hat F_{m,cm^{-1/5}}^{SM,1}$ with bandwidth
$h = cm^{-1/5}$. We repeat this many times (say $B$ times), and estimate
aMSE($c$) by

$$
\widehat{\mathrm{MSE}}_B(c) = B^{-1}\sum_{i=1}^{B}\bigl(\hat F_{m,cm^{-1/5}}^{SM,i}(t) - \hat F_{n,h_0}^{SM}(t)\bigr)^2.
$$

The optimal bandwidth $h_{n,F,SM}$ is then estimated by
$\hat h_{n,F,SM} = \hat c_{F,SM}\,n^{-1/5}$, where $\hat c_{F,SM}$ is defined as
the minimizer of $\widehat{\mathrm{MSE}}_B(c)$ over all positive $c$. For the
other estimators, the smoothed bootstrap works similarly.
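The procedure can be sketched in code as follows. This is a rough sketch under several assumptions that are not taken from the paper: the triweight kernel is used, kernel noise is drawn via the fact that the triweight density is a Beta(4,4) density rescaled to $[-1,1]$, bootstrap censoring times that fall below zero are reflected at zero instead of using a boundary kernel, any mass the MLE leaves unassigned is placed at $+\infty$, and the minimization over $c$ is done on a finite grid. All function names are hypothetical.

```python
import numpy as np

def pava(y):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators)."""
    means, counts = [], []
    for value in np.asarray(y, dtype=float):
        means.append(value)
        counts.append(1)
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            pooled = (counts[-2] * means[-2] + counts[-1] * means[-1]) / total
            means[-2:] = [pooled]
            counts[-2:] = [total]
    return np.repeat(means, counts)

def K_triweight(u):
    """Distribution function of the triweight kernel."""
    u = np.clip(u, -1.0, 1.0)
    return 0.5 + 35.0 / 32.0 * (u - u**3 + 0.6 * u**5 - u**7 / 7.0)

def mle_jumps(t_obs, delta):
    """Ordered times and the mass the MLE puts on each of them."""
    order = np.argsort(t_obs, kind="stable")
    fitted = pava(np.asarray(delta, dtype=float)[order])
    return np.asarray(t_obs, dtype=float)[order], np.diff(np.concatenate(([0.0], fitted)))

def smle_at(t, tau, jumps, h):
    """SMLE of F_0 at the point t for bandwidth h."""
    return float(K_triweight((t - tau) / h) @ jumps)

def kernel_noise(rng, size):
    """Draws from the triweight density (a Beta(4,4) rescaled to [-1, 1])."""
    return 2.0 * rng.beta(4.0, 4.0, size=size) - 1.0

def smoothed_bootstrap(t_obs, delta, t, c0, c_grid, m, B, rng):
    """Grid minimizer of the bootstrap MSE and the implied bandwidth."""
    n = len(t_obs)
    h0 = c0 * n ** (-0.2)
    tau, jumps = mle_jumps(t_obs, delta)
    target = smle_at(t, tau, jumps, h0)
    # Mixture representation of the SMLE; leftover mass is sent to +infinity.
    probs = np.append(jumps, max(0.0, 1.0 - jumps.sum()))
    probs = probs / probs.sum()
    centers = np.append(tau, np.inf)
    mse = np.zeros(len(c_grid))
    for _ in range(B):
        x_star = centers[rng.choice(n + 1, size=m, p=probs)] + h0 * kernel_noise(rng, m)
        t_star = np.abs(rng.choice(t_obs, size=m) + h0 * kernel_noise(rng, m))
        d_star = (x_star <= t_star).astype(float)
        tau_s, jumps_s = mle_jumps(t_star, d_star)
        for i, c in enumerate(c_grid):
            mse[i] += (smle_at(t, tau_s, jumps_s, c * m ** (-0.2)) - target) ** 2
    mse /= B
    c_hat = float(c_grid[int(np.argmin(mse))])
    return c_hat, c_hat * n ** (-0.2), mse

# Small illustrative run (much smaller than the n, m and B used in the paper).
rng = np.random.default_rng(2)
n = 300
x = 2.0 + rng.gamma(shape=4.0, scale=1.0, size=n)
t_obs = rng.exponential(scale=3.0, size=n)
delta = (x <= t_obs).astype(float)
c_grid = np.linspace(2.0, 16.0, 8)
c_hat, h_hat, mse = smoothed_bootstrap(t_obs, delta, t=4.0, c0=10.0,
                                       c_grid=c_grid, m=150, B=20, rng=rng)
```

The grid search is a design simplification; a finer grid or a one-dimensional optimizer could be substituted, at the cost of more evaluations of the bootstrap SMLE.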

Table 1 contains the values of $\hat c_{F,SM}$ and $\hat h_{n,F,SM}$ for the
different choices of $c_0$ and two different points $t$ based on a simulation
study. For the distribution of the $X_i$, we took a shifted Gamma(4)
distribution, i.e., $f_0(x) = \frac{(x-2)^3}{3!}\exp(-(x-2))1_{[2,\infty)}(x)$,
and for the distribution of the $T_i$ we took an exponential distribution with
mean 3, i.e., $g(t) = \frac{1}{3}\exp(-t/3)1_{[0,\infty)}(t)$. Furthermore, we
took $n = 10{,}000$, $m = 2000$, $B = 500$ and
$k(t) = \frac{35}{32}(1 - t^2)^3 1_{[-1,1]}(t)$, the triweight kernel. The table
also contains the theoretical aMSE-optimal values $c_{F,SM}$, given in (4.5),
the values of $\tilde c_{F,SM}$ using Monte Carlo simulations of size
$n = 10{,}000$ and $m = 2000$ and the corresponding values of $h_{n,F,SM}$ and
$\tilde h_{n,F,SM}$. In the Monte Carlo simulation, we resampled $B$ times a
sample of size $n$ (and $m$) from the true underlying distributions and
estimated, in case of sample size $n$, the aMSE by

$$
\widetilde{\mathrm{MSE}}_B(c) = B^{-1}\sum_{i=1}^{B}\bigl(\hat F_{n,cn^{-1/5}}^{SM,i}(t) - F_0(t)\bigr)^2.
$$

Then $\tilde c_{F,SM}$ is defined as the minimizer of
$\widetilde{\mathrm{MSE}}_B(c)$ over all positive $c$ and
$\tilde h_{n,F,SM} = \tilde c_{F,SM}\,n^{-1/5}$. Figure 2 shows the aMSE($c$)
for $t = 4$ and its estimates $\widehat{\mathrm{MSE}}_B(c)$ with $c_0 = 15$ and
$\widetilde{\mathrm{MSE}}_B(c)$. Figure 2 also shows the estimator
$\hat F_n^{SM}$ with bandwidth $h = 1.7$ (which is somewhere in the middle of
the results in Table 1 for $c_0 = 15$), the maximum likelihood estimator
$\hat F_n$ and the true distribution $F_0$.

We also applied the smoothed bootstrap to choose the smoothing parameter
for $\hat F_n^{SM}(t)$ based on the hepatitis A prevalence data described by
Keiding (1991). Table 2 contains the values of $\hat c_{F,SM}$ and
$\hat h_{n,F,SM}$ for three different time points, $t = 20$, $t = 45$ and
$t = 70$, and for different values of $c_0$.

Table 1

Minimizing values for c and corresponding values of the bandwidth based on
the smoothed bootstrap method for different values of c_0, based on Monte
Carlo simulations and the theoretical values

                      t = 4.0                  t = 6.5
                 c_F,SM    h_n,F,SM       c_F,SM    h_n,F,SM
c_0 = 5           6.050     0.959          9.150     1.450
c_0 = 10          7.350     1.165         10.100     1.601
c_0 = 15          7.700     1.220         12.050     1.910
c_0 = 20          7.850     1.244         14.150     2.243
c_0 = 25          9.850     1.561         15.500     2.457
MC-sim (n)        6.700     1.062         10.700     1.696
MC-sim (m)        6.750     1.070         11.600     1.838
Theor. val.       6.467     1.025         10.426     1.652



Fig. 2. Left panel: the aMSE of $\hat F_n^{SM}(4)$ (dotted line) and its estimates based on the
smoothed bootstrap (solid line) with $c_0 = 15$ and the Monte Carlo simulations (dashed
lines) with sample size n (black line) and m (grey line). Right panel: the true distribution
(dash-dotted line) and its estimators $\hat F_n^{SM}$ with h = 1.7 (solid line) and $\hat F_n$ (step function).

The size $n$ of the hepatitis A prevalence data is 850. For the sample size $m$ of
the smoothed bootstrap sample, we took 425 and we repeated the smoothed
bootstrap $B = 500$ times. If we take the smoothing parameter $h$ equal to 25
(which is somewhere in the middle of the results in Table 2), the resulting
estimator $\hat F_n^{SM}$ is shown in Figure 3. The maximum likelihood estimator $\hat F_n$
is also shown in Figure 3.
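
The bandwidths in Tables 1 and 2 are related to the constants $c$ through $h = cn^{-1/5}$, with $n = 10{,}000$ in the simulation study and $n = 850$ for the hepatitis A data. A minimal Python check of this relation follows; the function name `bandwidth` is ours and purely illustrative.

```python
def bandwidth(c, n):
    """Bandwidth h = c * n**(-1/5) of the smoothed MLE of the distribution function."""
    return c * n ** (-1.0 / 5.0)

# Compare with the tabulated bandwidths:
h_table1 = bandwidth(6.050, 10_000)  # Table 1, c_0 = 5, t = 4.0 (tabulated: 0.959)
h_table2 = bandwidth(107.7, 850)     # Table 2, c_0 = 50, t = 20 (tabulated: 27.947)
```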
Table 2
Minimizing values for c and corresponding values of the bandwidth based on the smoothed
bootstrap method for different values of c_0 and for three different values of t

                    t = 20                    t = 45                    t = 70
             c_{F,SM}   h_{n,F,SM}     c_{F,SM}   h_{n,F,SM}     c_{F,SM}   h_{n,F,SM}
c_0 = 50       107.7      27.947         60.3       15.647        128.9      33.448
c_0 = 60       105.6      27.402         67.6       17.541        128.7      33.396
c_0 = 70       106.7      27.687         67.8       17.593        127.4      33.059
c_0 = 80       101.8      26.416         71.6       18.579        130.4      33.837
c_0 = 90        92.5      24.003         70.4       18.268        131.0      33.993
c_0 = 100       91.9      23.847         76.5       19.851        127.5      33.085
c_0 = 110       90.5      23.484         75.9       19.695        126.2      32.747
c_0 = 120       89.8      23.302         80.8       20.967        124.3      32.254
c_0 = 130       89.4      23.198         81.0       21.018        124.5      32.306
c_0 = 140       84.2      21.849         81.9       21.252        120.2      31.190
c_0 = 150       87.3      22.653         88.7       23.017        117.4      30.464


6. Discussion. We considered two different methods to obtain smooth
estimates for the distribution function $F_0$ and its density $f_0$ in the current
status model. Pointwise asymptotic results show that for estimating either
of these functions both estimators have the same variance but a different
asymptotic bias. The asymptotic bias of the MSLE equals the asymptotic
bias of the SMLE plus an additional term depending on the unknown
densities $f_0$ and $g$ (and their derivatives) and the point $t$ we estimate at. For
some choices of $f_0$ and $g$ this additional term is positive, for other choices
it is negative. Hence, we cannot say that one method always results in a smaller
bias than the other, i.e., that one estimator is uniformly superior. This
was also seen by Marron and Padgett (1987) and Patil, Wells and Marron
(1994) in the case of estimating densities based on right-censored data.
Figure 4 shows the asymptotic mean squared error of the estimators $\hat F_n^{MS}(t)$
and $\hat F_n^{SM}(t)$ if $F_0$ is the shifted Gamma(4) distribution and $G$ is the
exponential distribution with mean 3, i.e., $f_0(x) = \frac{(x-2)^3}{3!}\exp(-(x-2))\,1_{[2,\infty)}(x)$,
$g(t) = \frac{1}{3}\exp(-t/3)\,1_{[0,\infty)}(t)$ and $c = 7.5$. For some values of $t$ the aMSE of
$\hat F_n^{MS}(t)$ is smaller [meaning that the bias of $\hat F_n^{MS}(t)$ is smaller], for other
values of $t$ the aMSE of $\hat F_n^{SM}(t)$ is smaller [meaning that the bias of $\hat F_n^{SM}(t)$
is smaller].

We also considered smooth estimators for the hazard rate $\lambda_0$, defined as

$$\lambda_n(t) = \frac{f_n(t)}{1 - F_n(t)},$$

where $f_n$ and $F_n$ are either $\hat f_n^{MS}$ and $\hat F_n^{MS}$ or $\hat f_n^{SM}$ and $\hat F_n^{SM}$. Because $\lambda_n(t)$
is a quotient, we could estimate numerator and denominator separately by
choosing one bandwidth $h = cn^{-1/7}$ to compute $f_n(t)$ and a different
bandwidth $h_1 = c_1 n^{-1/5}$ to compute $F_n(t)$. However, by the relation

$$\lambda_0(t) = -\frac{d}{dt}\log(1 - F_0(t)),$$




Fig. 3. The estimators $\hat F_n^{SM}$ (solid line) and $\hat F_n$ (dashed line) for the hepatitis A
prevalence data.

Fig. 4. The aMSE of $\hat F_n^{MS}(t)$ (solid line) and $\hat F_n^{SM}(t)$ (dashed line) as function of t in
the situation described in Section 6.
it is more natural to estimate $f_0(t)$ and $F_0(t)$ with the same bandwidth. As
for the estimators for $f_0$ and $F_0$, we cannot say the estimator $\hat\lambda_n^{MS}(t)$ with
bandwidth of order $n^{-1/7}$ is uniformly superior to $\hat\lambda_n^{SM}(t)$ with bandwidth
of order $n^{-1/7}$.

APPENDIX: TECHNICAL LEMMAS AND PROOFS 

In this section, we prove most of the results stated in the previous sections.
We start with some results on the consistency and pointwise asymptotics of
the kernel estimators $g_n$, $g_n'$, $G_n$, $g_{n,1}$, $g_{n,1}'$ and $G_{n,1}$.

Lemma A.1. Let $g_n$ be the boundary kernel estimator for $g$, with smoothing
parameter $h = n^{-\alpha}$ ($\alpha < 1/3$). Then with probability converging to one
$g_n$ is uniformly bounded, i.e.,

(A.1)  $\exists C > 0 : P\bigl(\sup_{x \ge 0} |g_n(x)| \le C\bigr) \to 1.$


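Proof. First note that without loss of generality we can assume $0 \le k(u) \le k(0)$.
Recall that $\nu_{i,\beta}(k) = \int_{-1}^{\beta} u^i k(u)\,du$ for $\beta \in [0,1]$, for which we
have the following bounds

$$\nu_{0,\beta} \ge \tfrac12, \qquad |\nu_{1,\beta}| \le \tfrac12 E_k|U|, \qquad \tfrac12\mathrm{Var}_k U \le \nu_{2,\beta} \le \mathrm{Var}_k U,$$

where $U$ has density $k$. Combining this, we get that $\nu_{0,\beta}\nu_{2,\beta} - \nu_{1,\beta}^2 \ge \tfrac14\mathrm{Var}_k|U| > 0$,
so that we can uniformly bound the boundary kernel $k^\beta$ by

$$|k^\beta(u)| \le c\,k(u)\,1_{(-1,\beta]}(u)$$

for a constant $c > 0$ depending only on $k$. For the boundary kernel estimate $g_n$, we then have

$$\begin{aligned}
|g_n(x)| &= \Bigl| h^{-1}\int k^\beta((x-y)/h)\,d\mathbb{G}_n(y) \Bigr| \\
&\le h^{-1} c\,k(0)\,|\mathbb{G}_n(x+h) - \mathbb{G}_n(x-h)| \\
&\le h^{-1} c\,k(0)\,|\mathbb{G}_n(x+h) - G(x+h) - \mathbb{G}_n(x-h) + G(x-h)| + h^{-1} c\,k(0)\,(G(x+h) - G(x-h)) \\
&\le c\,k(0)\,n^{\alpha-1/2}\,2\sup_{y \ge 0}\sqrt{n}\,|\mathbb{G}_n(y) - G(y)| + 2\|g\|_\infty c\,k(0) \\
&= O_p(n^{\alpha-1/2}) + 2\|g\|_\infty c\,k(0),
\end{aligned}$$

where $\mathbb{G}_n$ denotes the empirical distribution function of the $T_i$.
Since this bound is uniform in $x$, (A.1) follows for $C = 3\|g\|_\infty c\,k(0)$. □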
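
The moment bounds used in this proof can be checked numerically for a concrete kernel. The Python sketch below does so for the triweight kernel of Section 5, reading $\nu_{i,\beta}(k)$ as $\int_{-1}^{\beta}u^i k(u)\,du$; this reading of the garbled definition, the grid of $\beta$ values and all function names are assumptions made for illustration.

```python
def nu(i, beta):
    """nu_{i,beta}(k): integral of u**i * k(u) over [-1, beta] for the triweight kernel."""
    # (1 - u^2)^3 = 1 - 3u^2 + 3u^4 - u^6, multiplied by u**i and integrated termwise
    terms = [(1.0, i), (-3.0, i + 2), (3.0, i + 4), (-1.0, i + 6)]

    def primitive(x):
        return (35.0 / 32.0) * sum(a * x ** (p + 1) / (p + 1) for a, p in terms)

    return primitive(beta) - primitive(-1.0)

mean_abs_u = 2.0 * (nu(1, 1.0) - nu(1, 0.0))  # E_k|U|
var_u = nu(2, 1.0)                            # Var_k U, since E_k U = 0
var_abs_u = var_u - mean_abs_u ** 2           # Var_k|U|

def bounds_hold(beta, tol=1e-12):
    """Check the four inequalities of the proof at a given beta in [0, 1]."""
    n0, n1, n2 = nu(0, beta), nu(1, beta), nu(2, beta)
    return (
        n0 >= 0.5 - tol
        and abs(n1) <= 0.5 * mean_abs_u + tol
        and 0.5 * var_u - tol <= n2 <= var_u + tol
        and n0 * n2 - n1 ** 2 >= 0.25 * var_abs_u - tol
    )

all_ok = all(bounds_hold(j / 100.0) for j in range(101))
```

For the triweight kernel the last inequality is attained with equality at $\beta = 0$, which is why the check uses a small numerical tolerance.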
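Lemma A.2. Assume $g$ satisfies conditions (G.1) and let $G_n$, $G_{n,1}$, $g_n$, $g_{n,1}$,
$g_n'$ and $g_{n,1}'$ be kernel estimators for $G$, $G_1$, $g$, $g_1$, $g'$ and $g_1'$ with kernel
density $k$ satisfying condition (K.1) and bandwidth $h = cn^{-\alpha}$ ($c > 0$). For
$\alpha \in (0, 1/3)$ and $m > 0$,

(A.2)  $\sup_{t \in [m,\infty)} |g_n(t) - g(t)| \stackrel{P}{\longrightarrow} 0, \quad \sup_{t \in [m,\infty)} |g_n'(t) - g'(t)| \stackrel{P}{\longrightarrow} 0, \quad \sup_{t \in [0,2M_0]} |G_n(t) - G(t)| \stackrel{P}{\longrightarrow} 0,$

(A.3)  $\sup_{t \in [m,\infty)} |g_{n,1}(t) - g_1(t)| \stackrel{P}{\longrightarrow} 0, \quad \sup_{t \in [m,\infty)} |g_{n,1}'(t) - g_1'(t)| \stackrel{P}{\longrightarrow} 0, \quad \sup_{t \in [0,2M_0]} |G_{n,1}(t) - G_1(t)| \stackrel{P}{\longrightarrow} 0.$
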

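Proof. Let $g_n^u$ be the uncorrected kernel estimate for $g$ (with corresponding
distribution function estimate $G_n^u$) and note that by properties of the boundary
kernel estimator we have for all $x > h$

$$g_n(x) = g_n^u(x).$$

Hence, the first two results in (A.2) follow immediately from Theorems A
and C in Silverman (1978). To prove the third result in (A.2), fix $M > M_0$,
$\epsilon > 0$ and choose $0 < \delta < \epsilon/(2C)$ such that $G(\delta) < \epsilon/4$, where $C$ is such that
(A.1) holds. For all $x \ge 0$ and $n$ sufficiently large (such that $h = h_n < \delta$), we
then have

$$|G_n(x) - G(x)| \le \delta\sup_{y \in [0,\delta]}|g_n(y)| + G(\delta) + \sup_{y > \delta}|G_n^u(y) - G(y)|.$$

The right-hand side does not depend on $x$ so that

$$\begin{aligned}
P(\|G_n - G\|_\infty > \epsilon)
&\le P\Bigl(\delta\sup_{y \in [0,\delta]}|g_n(y)| + G(\delta) + \sup_{y > \delta}|G_n^u(y) - G(y)| > \epsilon\Bigr) \\
&= P\Bigl(\Bigl\{\delta\sup_{y \in [0,\delta]}|g_n(y)| + G(\delta) + \sup_{y > \delta}|G_n^u(y) - G(y)| > \epsilon\Bigr\} \cap \Bigl\{\sup_{y \in [0,\delta]}|g_n(y)| \le C\Bigr\}\Bigr) \\
&\quad + P\Bigl(\Bigl\{\delta\sup_{y \in [0,\delta]}|g_n(y)| + G(\delta) + \sup_{y > \delta}|G_n^u(y) - G(y)| > \epsilon\Bigr\} \cap \Bigl\{\sup_{y \in [0,\delta]}|g_n(y)| > C\Bigr\}\Bigr) \\
&\le P\Bigl(\sup_{y > \delta}|G_n^u(y) - G(y)| > \epsilon/4\Bigr) + P\Bigl(\sup_{y \in [0,\delta]}|g_n(y)| > C\Bigr).
\end{aligned}$$

The first probability converges to zero as a consequence of Theorem A in
Silverman (1978) and the second by (A.1), hence $\|G_n - G\|_\infty \stackrel{P}{\longrightarrow} 0$.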

For the first result in (A.3), define a binomially distributed random variable
$N_1 = \sum_{i=1}^n \Delta_i$ with parameters $n$ and $p = P(\Delta_1 = 1) = \int F_0(u)g(u)\,du$,
and the probability density $\bar g(t) = g_1(t)/p$. Let $V_1, \ldots, V_{N_1}$ be the $T_i$ such
that $\Delta_i = 1$, and rewrite $g_{n,1}(t)$ as $n^{-1}\sum_{i=1}^{N_1} k_h(t - V_i) = \frac{N_1}{n}\bar g_{N_1}(t)$. Then we
have by the triangle inequality

$$\|g_{n,1} - g_1\|_\infty = \Bigl\|\frac{N_1}{n}\bar g_{N_1} - p\,\bar g\Bigr\|_\infty \le p\,\|\bar g_{N_1} - \bar g\|_\infty + \|\bar g_{N_1}\|_\infty\Bigl|\frac{N_1}{n} - p\Bigr|.$$

The first term on the right-hand side converges to zero in probability by
Silverman (1978), since $N_1 \stackrel{P}{\longrightarrow} \infty$ as $n \to \infty$. For the second term on the
right-hand side, note that

$$\|\bar g_{N_1}\|_\infty = \|\bar g + \bar g_{N_1} - \bar g\|_\infty \le \|\bar g\|_\infty + \|\bar g_{N_1} - \bar g\|_\infty,$$

where the last term again converges to zero in probability by Silverman
(1978). Combining this with the Law of Large Numbers applied to $|N_1/n - p|$
gives that $\|\bar g_{N_1}\|_\infty\,|N_1/n - p| \stackrel{P}{\longrightarrow} 0$ as $n \to \infty$, hence $\|g_{n,1} - g_1\|_\infty \stackrel{P}{\longrightarrow} 0$. The
proofs of the other results in (A.3) are similar. □


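Lemma A.3. Let $g_n$ and $g_{n,1}$ be kernel estimates for $g$ and $g_1$ with kernel
density $k$ satisfying condition (K.1) and bandwidth $h = cn^{-\alpha}$ ($c > 0$). Fix
$t > 0$ such that $f_0'$ and $g''$ exist and are continuous at $t$. Then for $\alpha = 1/5$,

(A.4)  $n^{2/5}\left(\begin{pmatrix} g_n(t) \\ g_{n,1}(t) \end{pmatrix} - \begin{pmatrix} g(t) \\ g_1(t) \end{pmatrix}\right) \stackrel{\mathcal{D}}{\longrightarrow} N\left(\begin{pmatrix} \frac12 c^2 m_2(k) g''(t) \\ \frac12 c^2 m_2(k) g_1''(t) \end{pmatrix}, \Sigma_1\right)$

with

(A.5)  $\Sigma_1 = c^{-1}\int k(u)^2\,du \begin{pmatrix} g(t) & g_1(t) \\ g_1(t) & g_1(t) \end{pmatrix}.$

For $0 < \alpha < 1/5$,

$$n^{2\alpha}(g_n(t) - g(t)) \stackrel{P}{\longrightarrow} \tfrac12 c^2 m_2(k) g''(t)$$

and

$$n^{2\alpha}(g_{n,1}(t) - g_1(t)) \stackrel{P}{\longrightarrow} \tfrac12 c^2 m_2(k) g_1''(t).$$

Let $g_{n,1}'$ and $g_n'$ be as defined in (3.7). Then for fixed $t > 0$ such that $f_0''$
and $g^{(3)}$ exist and are continuous at $t$ and $\alpha = 1/7$,

(A.6)  $n^{2/7}\left(\begin{pmatrix} g_n'(t) \\ g_{n,1}'(t) \end{pmatrix} - \begin{pmatrix} g'(t) \\ g_1'(t) \end{pmatrix}\right) \stackrel{\mathcal{D}}{\longrightarrow} N\left(\begin{pmatrix} \frac12 c^2 m_2(k) g^{(3)}(t) \\ \frac12 c^2 m_2(k) g_1^{(3)}(t) \end{pmatrix}, \Sigma_2\right)$

with

(A.7)  $\Sigma_2 = c^{-3}\int k'(u)^2\,du \begin{pmatrix} g(t) & g_1(t) \\ g_1(t) & g_1(t) \end{pmatrix}.$
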
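Proof. We start with the proof of (A.4). Define

$$Y_i = \begin{pmatrix} Y_{i,1} \\ Y_{i,2} \end{pmatrix} = n^{-3/5}\begin{pmatrix} k_h(t - T_i) \\ \Delta_i k_h(t - T_i) \end{pmatrix}.$$

By the assumptions on $f_0$ and $g$ and condition (K.1), we have

$$E Y_i = n^{-3/5}\begin{pmatrix} g(t) + \frac12 h^2 m_2(k) g''(t) + o(h^2) \\ g_1(t) + \frac12 h^2 m_2(k) g_1''(t) + o(h^2) \end{pmatrix},$$

$$\sum_{i=1}^n \mathrm{Var}\,Y_i = c^{-1}\int k(u)^2\,du \begin{pmatrix} g(t) & g_1(t) \\ g_1(t) & g_1(t) \end{pmatrix} + O(n^{-1/5}).$$

By the Lindeberg-Feller central limit theorem, we get

$$n^{2/5}\left(\begin{pmatrix} g_n(t) \\ g_{n,1}(t) \end{pmatrix} - \begin{pmatrix} g(t) \\ g_1(t) \end{pmatrix}\right) \stackrel{\mathcal{D}}{\longrightarrow} N\left(\begin{pmatrix} \frac12 c^2 m_2(k) g''(t) \\ \frac12 c^2 m_2(k) g_1''(t) \end{pmatrix}, \Sigma_1\right),$$

where $\Sigma_1$ is defined in (A.5).

To prove that $n^{2\alpha}(g_n(t) - g(t)) \stackrel{P}{\longrightarrow} \frac12 c^2 m_2(k) g''(t)$ for $0 < \alpha < 1/5$, define
$W_i = n^{2\alpha-1}k_h(t - T_i)$. Since we have

$$E W_i = n^{2\alpha-1}\bigl(g(t) + \tfrac12 h^2 m_2(k) g''(t) + o(h^2)\bigr),$$

$$n\,\mathrm{Var}\,W_i = n^{5\alpha-1}c^{-1}g(t)\int k(u)^2\,du + O(n^{4\alpha-1}) = O(n^{5\alpha-1}),$$

we have that $\sum_{i=1}^n \mathrm{Var}\,W_i \to 0$ for $0 < \alpha < 1/5$, hence

$$\sum_{i=1}^n (W_i - E W_i) = n^{2\alpha}(g_n(t) - g(t)) - \tfrac12 c^2 m_2(k) g''(t) + o(1) \stackrel{P}{\longrightarrow} 0.$$

Similarly we can prove that $n^{2\alpha}(g_{n,1}(t) - g_1(t)) \stackrel{P}{\longrightarrow} \frac12 c^2 m_2(k) g_1''(t)$.
The proof of (A.6) is similar to the proof of (A.4). □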

Using these results we now can prove the results in Section 3. 
Proof of Lemma 3.2. The proof of the inequalities in Lemma 3.2 is
based on the Monotone Convergence Theorem (MCT). Denote the lower convex
hull of the continuous cusum diagram defined in (3.2) by $t \mapsto (G_n(t), C_n(t))$
for $t \in [0,\tau]$, where $\tau = \sup\{t \ge 0 : g_{n,0}(t) + g_{n,1}(t) > 0\}$. By definition of this
convex hull, we have for all $t \ge 0$

(A.8)  $G_{n,1}(t) = \int 1_{[0,t]}(u)\,dG_{n,1}(u) \ge \int 1_{[0,t]}(u)\,dC_n(u).$

The function $1_{[0,t]}(u)$ is decreasing in $u$ on $[0,\infty)$. Consider an arbitrary
distribution function $F$ on $[0,\infty)$ and write $p(t) = -\log F(t)$. Then, on $[0,\tau]$, the
function $p$ can be approximated by decreasing step functions

$$p_m(t) = \sum_{i=1}^m a_i 1_{[0,x_i]}(t) \quad \text{with } a_i \ge 0\ \forall i \text{ and } 0 < x_1 < \cdots < x_m < \tau.$$

The functions $p_m$ can be taken such that $p_m \uparrow p$ on $[0,\tau]$. For each $m$, we
have

(A.9)  $\int p_m(t)\,dG_{n,1}(t) = \sum_{i=1}^m \int a_i 1_{[0,x_i]}(t)\,dG_{n,1}(t) \ge \sum_{i=1}^m \int a_i 1_{[0,x_i]}(t)\,dC_n(t) = \int p_m(t)\hat F_n^{MS}(t)\,dG_n(t).$

The MCT now gives that for each $n$

$$\lim_{m\to\infty}\int p_m(t)\,dG_{n,1}(t) = \int p(t)\,dG_{n,1}(t) = -\int \log F(t)\,dG_{n,1}(t),$$

$$\lim_{m\to\infty}\int p_m(t)\,dC_n(t) = \int p(t)\,dC_n(t) = -\int \hat F_n^{MS}(t)\log F(t)\,dG_n(t).$$

Combined with (A.9), this implies the first inequality in Lemma 3.2.
To prove the second inequality in Lemma 3.2, it suffices to prove

(A.10)  $\int \log(1 - F(t))\,dG_{n,1}(t) \ge \int \hat F_n^{MS}(t)\log(1 - F(t))\,dG_n(t),$

since

$$\int \log(1 - F(t))\,dG_{n,0}(t) = \int \log(1 - F(t))\,d(G_n - G_{n,1})(t).$$

The proof of (A.10) follows by a similar argument, now using approximations
$q_m(t)$ of the decreasing function $q(t) = \log(1 - F(t))$ that converge
monotonically to $q$.

For the equality statements for $F = \hat F_n^{MS}$ in Lemma 3.2, we can also use
the monotone approximation by step functions, restricting the jumps to the
points of increase of $\hat F_n^{MS}$ [i.e., points $x$ for which $\hat F_n^{MS}(x + \epsilon) - \hat F_n^{MS}(x - \epsilon) > 0$
for all $\epsilon > 0$], implying equality in (A.9). □

Proof of Theorem 3.3. Take $0 < m < M < M_0$. By assumption (G.1)
and Lemma A.2, with probability arbitrarily close to one, we have for $n$
sufficiently large that $g_n(t) > 0$ for all $t \in [m, M]$. We then have that $F_n^{naive}(t) =
g_{n,1}(t)/g_n(t)$ is well defined on $[m, M]$ and to prove that $F_n^{naive}(t)$ is
monotonically increasing on $[m, M]$ with probability tending to one, it suffices to
show that $\exists\delta > 0$ such that $\forall\eta > 0$

(A.11)  $P\bigl(\forall t \in [m,M] : \tfrac{d}{dt}F_n^{naive}(t) \ge \delta\bigr) \ge 1 - \eta$

for $n$ sufficiently large. We have that

$$\frac{d}{dt}F_n^{naive}(t) = \frac{g_n(t)g_{n,1}'(t) - g_{n,1}(t)g_n'(t)}{[g_n(t)]^2},$$

which is also well defined.

To prove (A.11) it suffices to prove $\exists\delta > 0$ such that $\forall\eta > 0$

(A.12)  $P\bigl(\forall t \in [m,M] : g_n(t)g_{n,1}'(t) - g_{n,1}(t)g_n'(t) \ge \delta\bigr) \ge 1 - \eta$

for $n$ sufficiently large. For this, we write

$$\begin{aligned}
g_n(t)g_{n,1}'(t) - g_{n,1}(t)g_n'(t)
&= g_n(t)(g_{n,1}'(t) - g_1'(t)) + g_{n,1}(t)(g'(t) - g_n'(t)) \\
&\quad + g_1'(t)(g_n(t) - g(t)) + g'(t)(g_1(t) - g_{n,1}(t)) + g(t)g_1'(t) - g'(t)g_1(t) \\
&\ge -\sup_{t \in [m,M]}|g_{n,1}'(t) - g_1'(t)| \sup_{t \in [m,M]} g_n(t) \\
&\quad - \sup_{t \in [m,M]}|g_n'(t) - g'(t)| \sup_{t \in [m,M]} g_{n,1}(t) \\
&\quad - \sup_{t \in [m,M]}|g_n(t) - g(t)| \sup_{t \in [m,M]} |g_1'(t)| \\
&\quad - \sup_{t \in [m,M]}|g_{n,1}(t) - g_1(t)| \sup_{t \in [m,M]} |g'(t)| \\
&\quad + g^2(t)f_0(t).
\end{aligned}$$

By Lemma A.2 and assumptions (F.1) and (G.1), we have that (A.12) follows
for $\delta < \inf_{t \in [m,M]} g^2(t)f_0(t)$. □
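
The naive estimator $F_n^{naive}(t) = g_{n,1}(t)/g_n(t)$ studied in this proof is a ratio of two kernel sums, so the normalizing factors $1/(nh)$ cancel. The Python sketch below evaluates it on data simulated as in Section 5; the seed, the sample size, the bandwidth, the function names and the use of an uncorrected (no boundary kernel) estimator are assumptions made for illustration only.

```python
import random

def triweight(u):
    """Triweight kernel k(u) = (35/32)(1 - u^2)^3 on [-1, 1]."""
    return (35.0 / 32.0) * (1.0 - u * u) ** 3 if abs(u) <= 1.0 else 0.0

def naive_estimator(t, data, h):
    """g_{n,1}(t) / g_n(t) for data = [(T_i, Delta_i), ...]; nan if g_n(t) = 0."""
    num = sum(d * triweight((t - ti) / h) for ti, d in data)
    den = sum(triweight((t - ti) / h) for ti, _ in data)
    return num / den if den > 0.0 else float("nan")

def simulate(n, rng):
    """Current status data: X ~ 2 + Gamma(4, 1), T ~ Exp(mean 3), Delta = 1{X <= T}."""
    out = []
    for _ in range(n):
        x = 2.0 + rng.gammavariate(4.0, 1.0)
        t = rng.expovariate(1.0 / 3.0)
        out.append((t, 1 if x <= t else 0))
    return out

rng = random.Random(1)                                   # hypothetical seed
sample = simulate(2000, rng)
grid = [3.0 + 0.5 * j for j in range(9)]                 # interior points 3.0, ..., 7.0
naive_values = [naive_estimator(t, sample, 1.2) for t in grid]
```

Unlike the MSLE, this ratio is not monotone by construction; the theorem only states that it is increasing on $[m, M]$ with probability tending to one, so a small sample may still show local decreases.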

Proof of Corollary 3.4. Fix $\delta > 0$ arbitrarily. We will prove that
for $n$ sufficiently large

$$P\bigl(F_n^{naive}(t) = \hat F_n^{MS}(t) \text{ for all } t \in [m, M]\bigr) \ge 1 - \delta.$$

Define for $\eta_1 \in (0,m)$, $\eta_2 \in (0, M_0 - M)$ and $n \ge 1$ the event $A_n$ by

$$A_n = \{F_n^{naive}(t) \text{ is monotonically increasing and } g_n(t) > 0 \text{ for } t \in [m - \eta_1, M + \eta_2]\}.$$

By Lemma A.2 and Theorem 3.3, we have for all $n$ sufficiently large $P(A_n) \ge
1 - \delta/10$.

Define the "linearly extended $G_{n,1}$" by

$$C_n^*(t) = \begin{cases} G_{n,1}(m) + (G_n(t) - G_n(m))F_n^{naive}(m), & \text{for } t \in [0,m), \\ G_{n,1}(t), & \text{for } t \in [m,M], \\ G_{n,1}(M) + (G_n(t) - G_n(M))F_n^{naive}(M), & \text{for } t \in (M,M_0]. \end{cases}$$

It now suffices to prove that for all $n$ sufficiently large

(i) $P(\{(G_n(t), C_n^*(t)) : t \ge 0\} \text{ convex}) \ge 1 - \delta/2$,

(ii) $P(\forall t \in [0,M_0] : C_n^*(t) \le G_{n,1}(t)) \ge 1 - \delta/2$.

Indeed, then with probability $\ge 1 - \delta$ the curve $\{(G_n(t), C_n^*(t)) : t \ge 0\}$ is a
lower convex hull of the CCSD $\{(G_n(t), G_{n,1}(t)) : t \ge 0\}$ with $C_n^*(t) = G_{n,1}(t)$
for all $t \in [m, M]$. From this, it follows that $C_n^*(t) = C_n(t)$ for all $t \in [m, M]$,
hence also $C_n(t) = G_{n,1}(t)$ for all $t \in [m,M]$. This implies that for $n$
sufficiently large

$$P\left(\forall t \in [m,M] : \hat F_n^{MS}(t) = \frac{dC_n(t)}{dG_n(t)} = \frac{dG_{n,1}(t)}{dG_n(t)} = F_n^{naive}(t)\right) \ge 1 - \delta.$$

We now prove (i). For the intervals $[0,m)$ and $(M,M_0]$ the curve $\{(G_n(t),
C_n^*(t)) : t \ge 0\}$ is the tangent line of the CCSD at the points $(G_n(m), G_{n,1}(m))$
and $(G_n(M), G_{n,1}(M))$, respectively, so on the event $A_n$ the curve is convex.
This gives for $n$ sufficiently large

$$P(\{(G_n(t), C_n^*(t)) : t \ge 0\} \text{ convex}) \ge P(A_n) \ge 1 - \delta/10 \ge 1 - \delta/2.$$

To prove (ii), we split up the interval $[0,M_0]$ in five different intervals
$I_1 = [0, m - \eta_1)$, $I_2 = [m - \eta_1, m)$, $I_3 = [m,M]$, $I_4 = (M, M + \eta_2]$ and $I_5 =
(M + \eta_2, M_0]$ and prove that for $1 \le i \le 5$

(A.13)  $P(\forall t \in I_i : C_n^*(t) \le G_{n,1}(t)) \ge 1 - \delta/10.$

For $t \in I_3$, $C_n^*(t) = G_{n,1}(t)$, hence (A.13) holds trivially. For the interval $I_2$,
we use that

(A.14)  $G_{n,1}(u) - G_{n,1}(v) = (G_n(u) - G_n(v))F_n^{naive}(\xi)$

for some $\xi$ between $u$ and $v$ (depending on $u$ and $v$). This gives

$$\begin{aligned}
P(\forall t \in I_2 : G_{n,1}(t) - C_n^*(t) \ge 0)
&= P(\forall t \in I_2 : (G_n(t) - G_n(m))(F_n^{naive}(\xi) - F_n^{naive}(m)) \ge 0) \\
&= P(\forall t \in I_2 : F_n^{naive}(\xi) - F_n^{naive}(m) \le 0) \ge P(A_n) \ge 1 - \delta/10.
\end{aligned}$$

For $I_4$, we can reason similarly.

Now consider (A.13) for $i = 1$. For every $t \in I_1$ we have

$$G_1(t) - G_1(m) - F_0(m)(G(t) - G(m)) = \int_t^m (F_0(m) - F_0(u))\,dG(u) \ge \int_{m-\eta_1}^m (F_0(m) - F_0(u))\,dG(u).$$

This means we have

$$\begin{aligned}
G_{n,1}(t) - C_n^*(t)
&\ge G_{n,1}(t) - G_1(t) + G_1(m) - G_{n,1}(m) + F_0(m)(G(t) - G_n(t)) \\
&\quad + F_0(m)(G_n(m) - G(m)) + (F_n^{naive}(m) - F_0(m))(G_n(m) - G_n(t)) \\
&\quad + \int_{m-\eta_1}^m (F_0(m) - F_0(u))\,dG(u) \\
&\ge -2\|G_{n,1} - G_1\|_\infty - 2\|G_n - G\|_\infty - 2|F_n^{naive}(m) - F_0(m)| \\
&\quad + \int_{m-\eta_1}^m (F_0(m) - F_0(u))\,dG(u).
\end{aligned}$$

By assumption (F.1), we have $\int_{m-\eta_1}^m (F_0(m) - F_0(u))\,dG(u) > 0$ so (A.13)
follows for $i = 1$ by Lemma A.2 and the pointwise consistency of $F_n^{naive}$.
For $i = 5$, the proof of (A.13) is similar to that for $i = 1$. □

To prove the results in Section 4 and the results below, we use piecewise
constant versions of the functions $\psi_{h,t}$ and $\varphi_{h,t}$ defined in (4.1). These
functions are constant on the same intervals where the MLE $\hat F_n$ is constant.
Denote these intervals by $J_i = [\tau_i, \tau_{i+1})$ for $0 \le i \le m - 1$ ($m \le n$ and
$\tau_0 = 0$) and the piecewise constant versions of $\psi_{h,t}$ and $\varphi_{h,t}$ by $\bar\psi_{h,t}$ and
$\bar\varphi_{h,t}$. For $u \in J_i$ these functions can be written as $\bar\psi_{h,t}(u) = \psi_{h,t}(A_n(u))$ and
$\bar\varphi_{h,t}(u) = \varphi_{h,t}(A_n(u))$ for $A_n(u)$ defined as

(A.15)  $A_n(u) = \begin{cases} \tau_i, & \text{if } \forall t \in J_i : F_0(t) > \hat F_n(\tau_i), \\ s, & \text{if } \exists s \in J_i : \hat F_n(s) = F_0(s), \\ \tau_{i+1}, & \text{if } \forall t \in J_i : F_0(t) < \hat F_n(\tau_i), \end{cases}$

for $u \in J_i$, see also Figure 5.

We first derive upper bounds for the distance between the function $\psi_{h,t}$
and its piecewise constant version $\bar\psi_{h,t}$ and between $\varphi_{h,t}$ and $\bar\varphi_{h,t}$.

Lemma A.4. Let $t > 0$ be such that $f_0$ is positive and continuous in
a neighborhood of $t$. Then there exist constants $c_1, c_2 > 0$ such that for $n$
sufficiently large

(A.16)  $|\bar\psi_{h,t}(u) - \psi_{h,t}(u)| \le \frac{c_1}{h^2}|\hat F_n(u) - F_0(u)|\,1_{\{|t-u| \le h\}},$

(A.17)  $|\bar\varphi_{h,t}(u) - \varphi_{h,t}(u)| \le \frac{c_2}{h^3}|\hat F_n(u) - F_0(u)|\,1_{\{|t-u| \le h\}}.$

Proof. For $n$ sufficiently large, we have for all $s \in I_t = [t - h, t + h]$ that
$f_0(s) \ge \frac12 f_0(t)$. Fix $u \in I_t$, then the interval $J_i$ it belongs to is of one of the
following three types:

(i) $F_0(x) > \hat F_n(\tau_i)$ for all $x \in J_i$.

(ii) $F_0(x) = \hat F_n(x)$ for some $x \in J_i$.

(iii) $F_0(x) < \hat F_n(\tau_i)$ for all $x \in J_i$.

First, we consider the situation where $\hat F_n(u) = F_0(u)$. Then by definition of
$\bar\psi_{h,t}$,

$$\bar\psi_{h,t}(u) = \psi_{h,t}(u),$$

so that both the left- and the right-hand side of (A.16) are equal to zero, and
the upper bound holds. Note that for each $u$, $\hat F_n(u) = F_0(u)$ implies $A_n(u) = u$,
because $F_0$ is strictly increasing near $t$.

Now, we consider the situation where $\hat F_n(u) \ne F_0(u)$. For $v \in J_i$ and some $\xi$
between $u$ and $v$, we get by using a Taylor expansion

$$|\hat F_n(u) - F_0(u)| = |\hat F_n(v) - F_0(u)| = |\hat F_n(v) - F_0(v) - (u - v)f_0(\xi)|.$$

Now, we have three possibilities. If $A_n(u) = \tau_i$, then we have that $F_0(\tau_i) -
\hat F_n(\tau_i) \ge 0$ giving that

$$\begin{aligned}
|\hat F_n(u) - F_0(u)| &= |\hat F_n(\tau_i) - F_0(\tau_i) - (u - \tau_i)f_0(\xi)| \\
&= |(u - \tau_i)f_0(\xi) + F_0(\tau_i) - \hat F_n(\tau_i)| \\
&\ge |u - \tau_i|\,f_0(\xi).
\end{aligned}$$

If $A_n(u) = v$ for some $v \in J_i$, then we have that $\hat F_n(v) = F_0(v)$, so that

$$|\hat F_n(u) - F_0(u)| = |\hat F_n(v) - F_0(u)| = |\hat F_n(v) - F_0(v) - (u - v)f_0(\xi)| = |u - v|\,f_0(\xi).$$

If $A_n(u) = \tau_{i+1}$, then we have $\hat F_n(\tau_{i+1}-) - F_0(\tau_{i+1}) \ge 0$ giving that

$$\begin{aligned}
|\hat F_n(u) - F_0(u)| &= |\hat F_n(\tau_{i+1}-) - F_0(\tau_{i+1}) - (u - \tau_{i+1})f_0(\xi)| \\
&= |(\tau_{i+1} - u)f_0(\xi) + \hat F_n(\tau_{i+1}-) - F_0(\tau_{i+1})| \\
&\ge |\tau_{i+1} - u|\,f_0(\xi).
\end{aligned}$$

For $v = A_n(u) \in [\tau_i, \tau_{i+1}]$, this gives

$$|\hat F_n(u) - F_0(u)| \ge |u - v|\,f_0(\xi) \ge \tfrac12 f_0(t)\,|u - v| \ge 0.$$

Since it also holds that

$$|\bar\psi_{h,t}(u) - \psi_{h,t}(u)| = |\psi_{h,t}(v) - \psi_{h,t}(u)| \le c\,h^{-2}|v - u|,$$
$$|\bar\varphi_{h,t}(u) - \varphi_{h,t}(u)| = |\varphi_{h,t}(v) - \varphi_{h,t}(u)| \le c\,h^{-3}|v - u|,$$

the upper bound in (A.16) holds if $c_1 = 2c/f_0(t)$ and the upper bound in
(A.17) holds if $c_2 = 2c/f_0(t)$. □

To derive the asymptotic distribution of $\hat F_n^{SM}(t)$ we need a result on the
characterization of $\hat F_n$ and some results from empirical process theory, stated
in Lemmas A.5 and A.7 below.
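Lemma A.5. For every right continuous piecewise constant function $\varphi$
with only jumps at the points $\tau_1, \ldots, \tau_m$,

$$\int \varphi(u)(\delta - \hat F_n(u))\,d\mathbb{P}_n(u,\delta) = 0.$$

Proof. By the convex minorant interpretation of $\hat F_n$, we have that

$$\int_{[\tau_i,\tau_{i+1}) \times \{0,1\}} \delta\,d\mathbb{P}_n(u,\delta) = \int_{[\tau_i,\tau_{i+1}) \times \{0,1\}} \hat F_n(u)\,d\mathbb{P}_n(u,\delta)$$

for all $0 \le i \le m - 1$ (with $\tau_0 = 0$). This implies that

$$\int_{[\tau_i,\tau_{i+1}) \times \{0,1\}} \varphi(u)(\delta - \hat F_n(u))\,d\mathbb{P}_n(u,\delta) = \varphi(\tau_i)\int_{[\tau_i,\tau_{i+1}) \times \{0,1\}} (\delta - \hat F_n(u))\,d\mathbb{P}_n(u,\delta) = 0.$$

Hence,

$$\int \varphi(u)(\delta - \hat F_n(u))\,d\mathbb{P}_n(u,\delta) = \sum_{i=0}^{m-1}\int_{[\tau_i,\tau_{i+1}) \times \{0,1\}} \varphi(u)(\delta - \hat F_n(u))\,d\mathbb{P}_n(u,\delta) = 0. \qquad □$$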
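
The identity of Lemma A.5 can be illustrated on a toy data set. The Python sketch below uses the standard fact that the values of the MLE at the ordered inspection times are the isotonic regression of the ordered indicators, computes it with pool-adjacent-violators in exact rational arithmetic, and checks that the weighted residual sum vanishes for a function that is constant on the blocks of the MLE. The data vector, the choice of $\varphi$ and the function names are made up for illustration.

```python
from fractions import Fraction

def pava(y):
    """Nondecreasing least-squares fit via pool-adjacent-violators (exact arithmetic)."""
    blocks = []  # each block is [sum, count]
    for v in y:
        blocks.append([Fraction(v), 1])
        # merge while the previous block mean exceeds the last block mean
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit

delta = [0, 1, 0, 0, 1, 1, 0, 1]   # indicators ordered by inspection time (toy data)
fn = pava(delta)                   # MLE values: 0, 1/3 (x3), 2/3 (x3), 1
phi = [3 * f * f + 1 for f in fn]  # any function that is constant on the blocks of fn

# Sum of phi(T_i) * (Delta_i - F_n(T_i)); dividing by n gives the integral w.r.t. P_n.
score = sum(p * (d - f) for p, d, f in zip(phi, delta, fn))
```

Within each block the residuals $\Delta_i - \hat F_n(T_i)$ sum to zero because the fitted value is the block average, which is exactly the step used in the proof.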



Before we state the results on empirical process theory, we give some 
definitions and Theorem 2.14.1 in van der Vaart and Wellner (1996) needed 
for the proof of Lemma A. 7. 

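Let $\mathcal{F}$ be a class of functions on $\mathbb{R}_+$ and $L_2(Q)$ the $L_2$-norm defined by
a probability measure $Q$ on $\mathbb{R}_+$, i.e., for $g \in \mathcal{F}$

$$L_2(Q)[g] = \|g\|_{Q,2} = \left(\int_{\mathbb{R}_+} |g|^2\,dQ\right)^{1/2}.$$

For any probability measure $Q$, let $N(\epsilon, \mathcal{F}, L_2(Q))$ be the minimal number
of balls $\{g \in \mathcal{F} : \|g - f\|_{Q,2} < \epsilon\}$ of radius $\epsilon$ needed to cover the class $\mathcal{F}$. The
entropy $H(\epsilon, \mathcal{F}, L_2(Q))$ of $\mathcal{F}$ is then defined as

$$H(\epsilon, \mathcal{F}, L_2(Q)) = \log N(\epsilon, \mathcal{F}, L_2(Q))$$

and $J(\delta, \mathcal{F})$ is defined as

$$J(\delta, \mathcal{F}) = \sup_Q \int_0^\delta \sqrt{1 + H(\epsilon, \mathcal{F}, L_2(Q))}\,d\epsilon.$$

An envelope function of a function class $\mathcal{F}$ on $\mathbb{R}_+$ is any function $F$ such
that $|f(x)| \le F(x)$ for all $x \in \mathbb{R}_+$ and $f \in \mathcal{F}$.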
Theorem A.6 [Theorem 2.14.1 in van der Vaart and Wellner (1996)].
Let $P_0$ be the distribution of the observable vector $Z$ and $\mathcal{F}$ be a $P_0$-measurable
class of measurable functions with measurable envelope function $F$. Then

$$E\sup_{f \in \mathcal{F}}\left|\int f\,d\sqrt{n}(\mathbb{P}_n - P_0)\right| \lesssim J(1, \mathcal{F})\,\|F\|_{P_0,2},$$

where $\lesssim$ means $\le$ up to a multiplicative constant.

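Lemma A.7. Assume $F_0$ and $G$ satisfy conditions (F.1) and (G.1) and
let $\bar h : [0,\infty) \times \{0,1\} \to [-1,1]$ be defined as $\bar h(u,\delta) = F_0(u) - \delta$. Then for
$\alpha \le 1/5$ and $n \to \infty$

(A.18)  $R_n = n^{2\alpha}\int \bar\psi_{h,t}(u)(\hat F_n(u) - F_0(u))\,d(\mathbb{G}_n - G)(u) = o_p(1),$

(A.19)  $S_n = n^{2\alpha}\int \{\bar\psi_{h,t}(u) - \psi_{h,t}(u)\}\,\bar h(u,\delta)\,d(\mathbb{P}_n - P_0)(u,\delta) = o_p(1).$
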


Proof. Define It = [t — Vjt + u] for some > and note that by (2.5) 
and (2.6) for any r/ > we can find Mi, M2 > such that for all n sufficiently 
large 

P(.£i,n.Ah ): = P(snp\Fn{u)- Fo (u) I < Min"^/3 \ 

(A.20) 

> 1 - r//2, 

P{£2,n,M2) ■■ = P(snp\Aniu) - u\ < M2n~^/^logn) 

(A.21) 

> 1 -r//2- 

Also note that ||/i||oo ^ 1- Moreover, denote by A the class of monotone 
functions on X^, with values in [0, 2t]. Then we know, see, e.g., (2.5) in 
van de Geer (2000), that for all 5 > 

for any probability measure Q. For the same reason, the class Bm of func- 
tions of bounded variation on [0, 2i], absolutely bounded by M, has entropy 
function of the same order: 

H{b,BMMiCi))<^'^ foralU>0. 

Let us now start the main argument. Choose $\eta > 0$ and $M_1, M_2 > 0$ related
to (A.20) and (A.21), correspondingly. Let $\nu_{1,n}, \nu_{2,n}$ be vanishing sequences
of positive numbers and write

P([|i?„| >i/i,n]) = P([|i?„| >zyi,„]n£:i,„,Mi) + i^([|iin| >l^l,n]nff^„^Mi) 

P{[\Sn\ > iy2,n]) < P{[\Sn\ > Z^2,n] H <52,n,Af2 ) + r,/2 < Z^a^^El S„| l^^.n.A/, + V/^- 

Here, we use the Markov inequality, (A. 20) and (A. 21). We now concentrate 
on the terms i^{^^E|Pn|l£i,„,Mi and z^^^^EjS'nll^rj . We show that if we 
take, e.g., = en~^*(logn)^ for /3i = 5/6 — 7a/2 and /32 = 5/6 — 4a and any 
e > these terms wih be smaUer than r] /2 for ah n sufficiently large, showing 
that Rn = C'p(n-^i(logn)2) = Op{l) and Sn = ©^(^"^^(logn)^) = Op{l) for 
a < 1/5. 

We start with some definitions. Define for 

fc(n"(^-n)/c) ^ 
cg{u) 

the functions CA,B,n and Cs.n by 
a,B,n(u) = C„(^(n))P(u), 

CB,n(u, S) = ni/3-"(logn)-i/i(n, 5)(a(n-i/3p(^x) logn + u) - C„(n)) 
and let 

Ql,n = {UB,n ■■AeA,Be BmA, G2,n = {Cfi.n : ^ G ^A/J. 

Note that by condition (K.l) \Cn{u) — Cn{v)\ < n°'p\u — v\ for all u,v £lt 
and some constant p > depending only on the kernel k, the point t and 
the constant c. Also note that both classes Gi^n and G2,n have a constant 
times li( as envelope function, where the constant pi only depend on k, t, 
c and Mi, i = 1,2. For Ki^n = n^"~^/^logn and K2,n = n^"~^/^logn, we now 
have that 

E|-Rn|l£:i,„.Mi 



< E sup 

< Kl,nE sup 



n2°-i/3log?i y V(^W)^Wrf(G„-G)(u) 
e(n) dV^(G„-G)(n 



and 

E|'S'n|l£-2,„,M2 

< E sup 



BgB 



A/2 



n 



20-1/2 



(5) 
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< E sup K2,n 



X logn + u) -1p{u)}d^/^{Fn- Po){u,6) 



au,5)d^iFn-Po){u,6) 



To bound these expectations, we use Theorem A. 6. Using the entropy 
results for A and Bm together with smoothness properties, we bound the 
entropies of the classes Qi^n and Q2,n- Therefore, we fix an arbitrary proba- 
bility measure Q and 5 > 0. 

We start with the entropy of Gi^n- Select a minimal n~°5/(2pMi)-net 
Ai, . . . ,An^ in A and a minimal (5/(2||C„ ||oo)-net Bi, B2, ■ ■ ■ , Biy^ in Bj[i^ 
and construct the subset of Qi^n consisting of the functions (,Ai,Bj,n corre- 
sponding to these nets. The number of functions in this net is then given 
by 

NANB=eMH{n-''6/{2pMi),A,L2{Q)) + H{6/{2\\Cn\\oo),l3M„L2{Q))) 
<exp(Cn°/(^), 

where C > is a constant. This set is a 6-net in Gi^n- Indeed, choose a 
C = £,A,B,n G Gi,n and denote the closest function to A in the ^-net by Ai 
and similarly the function in the ;Bjvfj-net closest to B by Bj. Then 

ll'^A.B.n — (,Ai,Bj,n\\Q,2 

< ||C„,||oo||S(-) - B,i-)\\Q^2+Mi\\Cn{A{-)) - CniA{-))\\Q,2 

< 5/2 + Mipn'^WAi - A\\q^2 < S. 
This implies that 

i^(5,ai,n,^2(Q))<n75 

and 

Ji6, Qi,n) < ^l + H{e,gi,n,L2{Q))de < n^^^Vd. 

To bound the entropy of ^2,n) we select a minimal {6/p)-net Bi,B2, ■ ■ ■ , B^ 
in and construct the subset of Q2,n consisting of the functions CBi,n cor- 
responding to this net. The number of functions in this net is then given 

by 



N = exp{H{6/p,BM„L2{Q))) <exp{C/5), 
where $C > 0$ is a constant. This set is a $\delta$-net in $\mathcal{G}_{2,n}$. Indeed, choose a
$\zeta = \zeta_{B,n} \in \mathcal{G}_{2,n}$ and denote the closest function to $B$ in the $\mathcal{B}_{M_2}$-net by $B_i$,
then

\\CB,n - CBi,n\\L2{Q) 

<ni/3-"(logn)-i||/i|U 

X \\Cn{n~^^''B{-) logn + •) - C„(n-V3^,(-) ^^^^^ .^y^^^^^ 

< ni/3-"(log n)-inVn-i/3 log n||B, - BU^^q^ < 6. 
This imphes that 

H{6,g2,n,L2{Q))<l/6 and J(5, ^2,™) < ^5. 
We now obtain via Theorem A. 6 that 



E\Rn\l£^„j,j <Kl,nE SUp 



<^i,nJ(l,ai,n)<n^"/'-'/'logn, 



E|5'„|l£-2„_j,, < K2,n-E sup 



<«2,„J(l,a2,n)<n^"-'/'log7^. 

Hence, we can take = en~'^^(logn)^ for /3i = 5/6 — 7a/2, (32 = 5/6 — 4a 
and any e > to conclude that 



P 



P 



n 



/3i 



(logn)2 



|i?n| >e < 



e(logn)2^'""'^^^'"'^^i 



-i?|ii„|l£:^ . 



eiogn 



+ r//2<r/. 



|5n|>e < 



/32 



(logn)2 J e(logn)2 

for n sufficiently large. □ 



E\Sn\ls.^.,+vl2< 



1 



elogn 



+ r]/2<7] 



With this lemma, we now can prove Theorem 4.2. 

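Proof of Theorem 4.2. Using the piecewise constant version $\bar\psi_{h,t}$ of
$\psi_{h,t}$, we can write

$$\int \psi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta) = \int \bar\psi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta) + R_n,$$

where for $h = cn^{-\alpha}$ and $n$ sufficiently large

$$|R_n| \le c_1 h^{-2}\int_{u \in [t-h,t+h]} |F_0(u) - \hat F_n(u)|^2\,dG(u) = O_p(n^{\alpha-2/3}) = o_p(n^{-2\alpha})$$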
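by (2.3) and Lemma A.4. So we find

$$n^{2\alpha}\int \psi_{h,t}(u)(\delta - \hat F_n(u))\,dP_0(u,\delta) = n^{2\alpha}\int \bar\psi_{h,t}(u)(\delta - F_0(u))\,d(P_0 - \mathbb{P}_n)(u,\delta) + o_p(1)$$

using that $n^{2\alpha}R_n = o_p(1)$, Lemma A.5 and (A.18). By (A.19), we get

$$n^{2\alpha}\int \bar\psi_{h,t}(u)(\delta - F_0(u))\,d(P_0 - \mathbb{P}_n)(u,\delta) = n^{2\alpha}\int \psi_{h,t}(u)(\delta - F_0(u))\,d(P_0 - \mathbb{P}_n)(u,\delta) + o_p(1).$$

Applying the central limit theorem with $\alpha = 1/5$ gives

$$n^{2/5}\int \psi_{h,t}(u)(\delta - F_0(u))\,d(\mathbb{P}_n - P_0)(u,\delta) \stackrel{\mathcal{D}}{\longrightarrow} N(0, \sigma^2_{F,SM})$$

for $\sigma^2_{F,SM}$ as in (4.4). Note that now

$$\begin{aligned}
n^{2/5}(\hat F_n^{SM}(t) - F_0(t))
&= n^{2/5}\int \psi_{h,t}(u)(\delta - F_0(u))\,d(\mathbb{P}_n - P_0)(u,\delta) \\
&\quad + n^{2/5}\left(\int K_h(t - u)\,dF_0(u) - F_0(t)\right) + o_p(1) \stackrel{\mathcal{D}}{\longrightarrow} N(\mu_{F,SM}, \sigma^2_{F,SM}).
\end{aligned}$$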

To find our optimal bandwidth $h_{n,opt}$, we minimize the aMSE with respect
to $c$,

$$\mathrm{aMSE}(\hat F^{SM}, c) = \tfrac14 c^4 m_2^2(k) f_0'(t)^2 + c^{-1}\frac{F_0(t)(1 - F_0(t))}{g(t)}\int k(u)^2\,du,$$

which is a standard minimization in $c$, yielding (4.5). □
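
Setting the derivative of the aMSE with respect to $c$ equal to zero gives $c^5 = F_0(t)(1 - F_0(t))\int k(u)^2\,du \,/\, \bigl(g(t)\,m_2^2(k)\,f_0'(t)^2\bigr)$. The Python sketch below evaluates this minimizer for the simulation setting of Section 5 (shifted Gamma(4) event times, exponential inspection times with mean 3, triweight kernel), so that it can be compared with the theoretical values in Table 1. The function names are ours, and the closed-form kernel constants $\int k^2 = 350/429$ and $m_2(k) = 1/9$ are standard values for the triweight kernel rather than numbers quoted from the paper.

```python
import math

def gamma4_cdf(y):
    """Distribution function of the Gamma(4, 1) distribution at y."""
    if y <= 0.0:
        return 0.0
    return 1.0 - math.exp(-y) * (1.0 + y + y**2 / 2.0 + y**3 / 6.0)

def optimal_c(t):
    """Minimizer in c of the aMSE of the smoothed MLE at a point t > 2."""
    y = t - 2.0                                            # shift of the Gamma(4) law
    F0 = gamma4_cdf(y)
    f0_prime = (3.0 * y**2 - y**3) * math.exp(-y) / 6.0    # derivative of f_0 at t
    g = math.exp(-t / 3.0) / 3.0                           # exponential density, mean 3
    int_k2 = 350.0 / 429.0                                 # integral of k^2 (triweight)
    m2 = 1.0 / 9.0                                         # second moment of k (triweight)
    variance_constant = F0 * (1.0 - F0) * int_k2 / g
    return (variance_constant / (m2**2 * f0_prime**2)) ** 0.2

c_at_4 = optimal_c(4.0)     # Table 1 lists 6.467 as theoretical value
c_at_6_5 = optimal_c(6.5)   # Table 1 lists 10.426 as theoretical value
```

The formula breaks down where $f_0'(t) = 0$ (here at $t = 5$), since the leading bias term then vanishes and the aMSE expansion above no longer determines a finite optimal $c$.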